How to convert a string to UTF-16 format in Google Apps Script

Paul Clinton

I am writing a Google Apps Script that converts a string to UTF-16 Unicode escape format. For example:

Input: Hello World
Output: \u0048\u0065\u006c\u006c\u006f \u0057\u006f\u0072\u006c\u0064

What I actually want is for the script to convert a column containing Arabic words in a Google Docs spreadsheet to UTF-16 format, like this:

Input: مرحبا بالعالم
Output: \u0645\u0631\u062d\u0628\u0627 \u0628\u0627\u0644\u0639\u0627\u0644\u0645

Is there any way I could do this in Google Apps Script? If so, please point me in the right direction.

google-apps-script google-spreadsheet

Answers

answered 1 month ago pnuts #1

SO is not the place to have a complete script written for you from scratch but a formula might help you get started:

=TEXTJOIN(,,ArrayFormula(lower("\u0"&DEC2HEX(CODE(SPLIT(regexreplace(A1,"(\D)","$1\"),"\"))))))

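Note, though, that the above treats a space as a character and encodes it too, rather than leaving it unchanged. The fixed \u0 prefix also assumes a three-digit hex value, which holds for the Arabic characters (e.g. 645) but not for Latin ones such as H (48).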

REGEXREPLACE here 'captures' (the ( )) each individual non-digit character (the class \D) in A1 and appends a backslash to each captured group ($1). SPLIT parses the result of REGEXREPLACE at each \. CODE converts each character into its decimal Unicode value, which DEC2HEX then converts to signed hexadecimal format for appending to \u0 with the concatenation operator &. LOWER converts the alphabetic elements that DEC2HEX returns as capitals into lower case. SPLIT creates an array, so ARRAYFORMULA is required for the functions to process all the individual elements (e.g. DEC2HEX is a non-array function). TEXTJOIN then stitches all the pieces together and is used with defaults for the first two parameters.
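Since the question asks for a script rather than a formula, the same steps (take each character, get its code, convert to hex, prepend \u) could be sketched in Apps Script roughly as below. This is only a minimal sketch, not the answerer's method: the function names toUnicodeEscapes and UNICODE_ESCAPE are made up for illustration, and it assumes spaces should be left unchanged and every other UTF-16 code unit padded to four hex digits, as in the sample output in the question.

```javascript
/**
 * Converts every UTF-16 code unit of `text` to a \uXXXX escape.
 * Assumption: spaces are kept as-is, matching the question's sample output.
 */
function toUnicodeEscapes(text) {
  var out = '';
  for (var i = 0; i < text.length; i++) {
    var ch = text.charAt(i);
    if (ch === ' ') {
      out += ch; // leave spaces untouched
      continue;
    }
    // charCodeAt returns the UTF-16 code unit; pad the hex form to 4 digits
    var hex = text.charCodeAt(i).toString(16);
    out += '\\u' + ('0000' + hex).slice(-4);
  }
  return out;
}

/**
 * Hypothetical custom-function wrapper for use in a sheet.
 * Accepts a single cell value or a range (passed in as a 2D array).
 */
function UNICODE_ESCAPE(input) {
  if (Array.isArray(input)) {
    return input.map(function (row) {
      return row.map(function (cell) {
        return toUnicodeEscapes(String(cell));
      });
    });
  }
  return toUnicodeEscapes(String(input));
}
```

With this saved in the spreadsheet's script editor, something like =UNICODE_ESCAPE(A1:A10) in a cell should fill the neighbouring rows with the escaped strings. Characters outside the Basic Multilingual Plane come out as two escapes (a surrogate pair), which is what UTF-16 uses for them.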
