How To Calculate Byte Length Containing Utf8 Characters Using Javascript?
Solution 1:
As of 2018, the most compatible and reliable way of doing this seems to be the Blob API:
new Blob([str]).size
This is supported even in IE10, if anyone still uses that.
Solution 2:
The experimental TextEncoder API can be used for this but is not supported by Internet Explorer or Safari:
(new TextEncoder()).encode("i ♥ u i ♥ u i ♥ u i ♥ u i ♥ u").length;
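As a minimal sketch, the one-liner can be wrapped in a small helper. The name `utf8ByteLength` is made up for illustration, and the code assumes a runtime where `TextEncoder` is available as a global (modern browsers and recent Node.js versions).

```javascript
// Hypothetical helper name; not part of any library.
function utf8ByteLength(str) {
  // TextEncoder always encodes to UTF-8; encode() returns a Uint8Array,
  // so its length is the number of UTF-8 bytes.
  return new TextEncoder().encode(str).length;
}

console.log(utf8ByteLength("i ♥ u"));     // 7
console.log(utf8ByteLength("\u{1F600}")); // 4 (one code point outside the BMP)
```

The heart is three bytes in UTF-8, which is why the five-character string comes out as 7 bytes.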
Another alternative is to URI-encode the string and count characters and %-encoded escape sequences, as in this library:
~-encodeURI("i ♥ u i ♥ u i ♥ u i ♥ u i ♥ u").split(/%..|./).length
The GitHub page has a compatibility list, which includes IE9 but unfortunately not IE10.
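For readers who find the one-liner hard to follow, here is a more verbose sketch of the same idea. The function name `utf8ByteLengthViaUri` is made up for illustration and is not the library's API.

```javascript
// Hypothetical helper name, for illustration only.
function utf8ByteLengthViaUri(str) {
  // encodeURI writes every byte of a non-ASCII character (and of some ASCII
  // characters, such as space and "%") as a %XX escape. Everything else stays
  // as a single ASCII character. Note: it throws URIError on lone surrogates.
  var encoded = encodeURI(str);
  // Each match therefore stands for exactly one UTF-8 byte.
  var units = encoded.match(/%[0-9A-Fa-f]{2}|./g);
  return units ? units.length : 0;
}

console.log(utf8ByteLengthViaUri("i ♥ u")); // 7
```

Here `encodeURI("i ♥ u")` yields `i%20%E2%99%A5%20u`, which breaks down into seven units: `i`, `%20`, `%E2`, `%99`, `%A5`, `%20`, `u`.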
Since I cannot comment yet, I'll also note here that the solution in the accepted answer does not work for code points that consist of multiple UTF-16 code units (surrogate pairs).
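To illustrate the point about multiple UTF-16 code units, here is a rough sketch that counts bytes per Unicode code point rather than per code unit. The function name is hypothetical, and the code assumes an ES2015+ runtime (`for...of` over strings and `codePointAt`).

```javascript
// Hypothetical helper; counts UTF-8 bytes one Unicode code point at a time.
function utf8ByteLengthByCodePoint(str) {
  var bytes = 0;
  // for...of iterates a string by code point, so a surrogate pair
  // is visited once instead of as two separate code units.
  for (var ch of str) {
    var cp = ch.codePointAt(0);
    if (cp < 0x80) bytes += 1;         // U+0000..U+007F
    else if (cp < 0x800) bytes += 2;   // U+0080..U+07FF
    else if (cp < 0x10000) bytes += 3; // U+0800..U+FFFF
    else bytes += 4;                   // U+10000..U+10FFFF
  }
  return bytes;
}

console.log("\u{1F600}".length);                     // 2 UTF-16 code units
console.log(utf8ByteLengthByCodePoint("\u{1F600}")); // 4 UTF-8 bytes
```

A counter that adds 3 bytes for each of the two surrogate code units would report 6 for that emoji instead of 4.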
Solution 3:
Counting UTF-8 bytes comes up quite a bit in JavaScript. A bit of looking around turns up a number of libraries that can help (here's one example: https://github.com/mathiasbynens/utf8.js). I also found a thread (https://gist.github.com/mathiasbynens/1010324) full of solutions specifically for UTF-8 byte counts.
Here is the smallest and most accurate function from that thread:
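function countUtf8Bytes(s) {
  var b = 0, i = 0, c;
  // Per UTF-16 code unit: 1 byte below 0x80, 2 bytes below 0x800, otherwise 3.
  for (; c = s.charCodeAt(i++); b += c >> 11 ? 3 : c >> 7 ? 2 : 1);
  return b;
}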
Note: I rearranged it a bit so that the signature is easier to read. However, it's still a very compact function that might be hard for some to understand.
You can check its results with this tool: https://mothereff.in/byte-counter
One correction to your original post: the example string you provided, i ♥ u, is actually 7 bytes, and this function does count it correctly.
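As a quick local check, the sketch below compares the compact function against `TextEncoder` on a few sample strings. The sample strings and the `referenceByteLength` name are my own additions, and `TextEncoder` is assumed to be available in the runtime.

```javascript
// The compact function from the thread, spaced out for readability.
function countUtf8Bytes(s) {
  var b = 0, i = 0, c;
  for (; (c = s.charCodeAt(i++)); b += c >> 11 ? 3 : c >> 7 ? 2 : 1);
  return b;
}

// Reference count (hypothetical helper), assuming TextEncoder exists.
function referenceByteLength(s) {
  return new TextEncoder().encode(s).length;
}

// Sample inputs chosen for illustration.
var samples = ["i ♥ u", "h\u00e9llo", "\u{1F600}", "a\u0000b"];
samples.forEach(function (s) {
  console.log(JSON.stringify(s), countUtf8Bytes(s), referenceByteLength(s));
});
```

The first two samples agree (7 and 6 bytes). The last two differ: the emoji is counted as 6 rather than 4 because each surrogate is treated as a 3-byte character, and the string containing a NUL character is counted as 1 rather than 3 because the loop stops at the first code unit equal to 0.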