Converting HEX values to Unicode characters

Question

I have a small bot for a social network that accepts messages of the form "unicode XXXX XXXX XXXX ...", where each XXXX is a HEX value that I should convert to a Unicode character. Initially the command accepted only one HEX value, and it worked fine and fast. After I modified the code to accept multiple values, the process slowed down (responses went from 1-2 seconds to 4-5 seconds).

Here is my code:

    def unic(msg):
        if type(msg) == list:
            msg.pop(0)  # removing "unicode" word
            try:
                sym = ''  # final result
                correct = []  # correct HEX values to be converted
                for code in msg:
                    try:
                        chr(int(code, 16))
                        correct.append(code)
                    except:
                        pass
                if correct != []:
                    for code in correct:
                        sym = sym + chr(int(code, 16))
                elif correct == []:
                    return c_s.get('incorrect')  # returning error message from common strings list
                if sym != '':
                    return sym
            except:
                return c_s.get('incorrect')

What should I change here to speed up the process? Any suggestions are welcome.

Note that 'hex' is short for 'hexadecimal' — there's no reason to write it in capital letters. — deltab, Commented Nov 19, 2016 at 0:30
That's a hugely greater time difference than I'd expect from simple algorithmic code (no database lookups, etc.). I suspect the extra time is mostly your system having to load and render uncached parts of a font. — deltab, Commented Nov 19, 2016 at 1:22

Peilonrayz · Accepted Answer · 2016-11-19 02:12:07Z

Only use try-except when you need to. Your outer try should be unnecessary. You should also call chr(int(code, 16)) only once per value: running it twice gains nothing and just costs cycles.

Your output handling is horrible: invalid values are silenced explicitly, yet when there is no valid input at all you switch to returning an error. And if you somehow have a list of items but none of them gets added to sym, you return None. You need to pick how your function works: either remove invalid characters, or raise errors.

Doing the above removes half your code. I'd also change the bare except to catch specific errors, e.g. ValueError; otherwise your program is prone to masking bugs.
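To illustrate the masking problem, here is a minimal sketch. The names `convert_bare` and `convert_narrow` are hypothetical and not part of the question's code. Passing `None` stands in for a bug in the caller: `int(None, 16)` raises TypeError, which the bare except hides and the narrow except lets through.

```python
def convert_bare(code):
    # Bare except: swallows every exception, including ones caused by bugs.
    try:
        return chr(int(code, 16))
    except:
        return '?'


def convert_narrow(code):
    # Only a ValueError (bad hex text, out-of-range code point) counts as bad input.
    try:
        return chr(int(code, 16))
    except ValueError:
        return '?'


print(convert_bare('41'))    # A
print(convert_bare(None))    # ? (the caller's bug is hidden)
print(convert_narrow('zz'))  # ? (genuinely invalid hex)
# convert_narrow(None) raises TypeError, so the bug stays visible.
```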

Strings are immutable, so sym = sym + chr(int(code, 16)) may copy the whole string built so far on every iteration, which can get slow for long inputs. Instead, build a list and ''.join it.
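The two approaches side by side, as a small sketch with made-up sample input. CPython can sometimes perform the concatenation in place, but that is an implementation detail, and for a handful of characters the difference is unlikely to be measurable either way.

```python
codes = ['48', '65', '6C', '6C', '6F']  # sample input, chosen for illustration

# Repeated concatenation: each + may copy the whole string built so far.
sym = ''
for code in codes:
    sym = sym + chr(int(code, 16))

# Build all the pieces first, then join them once at the end.
joined = ''.join(chr(int(code, 16)) for code in codes)

print(sym, joined)  # Hello Hello
```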

Finally, return c_s.get('incorrect') is a massive red flag; remove these returns and raise exceptions instead.

This can get you:

    def unicode_(msg):
        new_msg = []
        for char in msg:
            try:
                char = chr(int(char, 16))
            except ValueError:
                char = '?'
            new_msg.append(char)
        return ''.join(new_msg)
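A usage sketch, assuming the bot splits the incoming message on whitespace and drops the leading "unicode" word before calling the function (that calling convention is an assumption, not something the question states). `unicode_strict` is a hypothetical variant showing the other policy mentioned above, raising instead of substituting '?'.

```python
def unicode_(msg):
    # Same logic as the function above: invalid values become '?'.
    new_msg = []
    for char in msg:
        try:
            char = chr(int(char, 16))
        except ValueError:
            char = '?'
        new_msg.append(char)
    return ''.join(new_msg)


def unicode_strict(msg):
    # Hypothetical alternative: the first invalid value raises ValueError,
    # and the caller decides which error message to send back.
    return ''.join(chr(int(code, 16)) for code in msg)


words = 'unicode 48 69 zz 263A'.split()
print(unicode_(words[1:]))  # Hi?☺

try:
    unicode_strict(words[1:])
except ValueError as exc:
    print('incorrect input:', exc)
```

One caveat: in CPython, a very large value such as 'FFFFFFFFFFFFFFFFFFFF' makes chr raise OverflowError rather than ValueError, so catching (ValueError, OverflowError) may be worth considering if users can send arbitrary input.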

@MaxLunar No problem! Just so you know, accepting my answer so early discourages others from posting answers, and you may miss out on a better answer than the one I provided. :) — Peilonrayz, Commented Nov 18, 2016 at 16:09
That code already looks fine, and I have tested it in work. It does work as fast as the single HEX value converter. — MaxLunar, Commented Nov 18, 2016 at 16:16
Not sure what you mean by "it can sometimes take a little while" — more than half a microsecond? — deltab, Commented Nov 19, 2016 at 0:46
@deltab What did you try? I tried abcd, and it took half a second. — Peilonrayz, Commented Nov 19, 2016 at 0:50
@Peilonrayz Did you also output the character? It's a CJK character and if you haven't rendered any CJK characters recently, your system would have to load uncached parts of a font file. Try doing it again and compare how long it takes each time. — deltab, Commented Nov 19, 2016 at 1:16
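One way to check the point raised in these comments is to time the conversion on its own, without printing the result, so that font loading and rendering cannot contribute. This is only a sketch: `convert` is a hypothetical helper, and the figures depend on the machine, so no expected timing is shown.

```python
import timeit


def convert(codes):
    # Conversion only; nothing is printed, so no font rendering is involved.
    return ''.join(chr(int(code, 16)) for code in codes)


codes = ['abcd'] * 10  # the value mentioned in the comments, repeated
runs = 10_000
total = timeit.timeit(lambda: convert(codes), number=runs)
print(f'{total / runs * 1e6:.2f} microseconds per call')
```

If this reports times far below the observed response delay, the slowdown is more likely to come from rendering, network round trips, or other parts of the bot than from the conversion itself.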
