Don't encode SMS when calculating content length
We were previously encoding the content as UTF-8 and measuring the number of bytes. We've been doing this since the beginning of time, and for most of that time it wasn't an issue. However, when we started accepting unicode characters (notably some Welsh-only accented characters, e.g. Ŵ or ŷ), they are encoded as two bytes rather than one. So a message of 70 ŷ characters is 140 bytes long, but a message of 69 a characters and one ŷ is 71 bytes long.

However, in reality, when we send a message, if _any_ character in it is non-GSM (similar to but not the same as ASCII), then the entire thing is encoded in UTF-16, where every single character (including basic Latin text) is two bytes long. So both ŷŷŷŷ... and aaaaa....aŷ are actually 70 UTF-16 characters, and we already take account of the double code point width when determining billable fragments.

This means that in a small number of cases (where someone has a message that is almost at a message boundary in size and contains a lot of Welsh characters), we've over-billed that service. But this is probably an incredibly small number of messages, and impossible to account for after one week, so probably not worth worrying about.
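
To make the arithmetic concrete, here is a minimal sketch of the two ways of measuring the messages described in this commit message. The function names are hypothetical and are not taken from the codebase; they only illustrate the difference between counting UTF-8 bytes and counting characters.

```python
def utf8_byte_length(content: str) -> int:
    # Hypothetical stand-in for the old behaviour: encode as UTF-8, count bytes.
    return len(content.encode("utf-8"))


def character_count(content: str) -> int:
    # Hypothetical stand-in for the new behaviour: count characters, no encoding.
    return len(content)


def utf16_code_units(content: str) -> int:
    # Each UTF-16 code unit is two bytes; every character used here fits in one unit.
    return len(content.encode("utf-16-be")) // 2


all_welsh = "ŷ" * 70
mostly_ascii = "a" * 69 + "ŷ"

for message in (all_welsh, mostly_ascii):
    print(
        utf8_byte_length(message),
        character_count(message),
        utf16_code_units(message),
    )
```

The UTF-8 byte counts differ between the two messages (140 and 71), while the character count and the UTF-16 code unit count are 70 for both, which is the length that matters once the whole message is sent as UTF-16.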