7
\$\begingroup\$

Background

I got this interview test and got declined due to not meeting their expectations, but never got a reason on what was bad and how they solved it. I want to improve and learn something out of it.

Question

How can this be better written and what is badly written?

Assignment description

To optimize the amount of data sent between the web browser to the web server a "clever" Javascript developer came up with the idea to translate JSON objects into binary format in the application and send them to the server. Faced with the fact that the Javascript is released in its final version to the customer it is now your task to develop the parser on the back end system.

A JSON object is a hierarchy of key-value pairs where a value in its turn can contain new key-value pairs. It consists of four basic types: numbers, strings, arrays and dictionaries. An array is a list of values and a dictionay is a list of key-value pairs. The key can only be of the type number or string while a value can be of any type.

A JSON object always starts with a value.

An example JSON object can look like:

{ 'firstName': 'John', 'lastName': 'Smith', 'age': 25, 'address': { 'streetAddress': '21 2nd Street', 'city': 'New York', 'state': 'NY', 'postalCode': '10021' }, 'phoneNumber': [ { 'type': 'home', 'number': '212 555-1234' }, { 'type': 'fax', 'number': '646 555-4567' } ] } 

A number is printed in decimal without any decoration. Example: 25

A string is printed in ASCII with single quotes in the start and end of the string. Example: 'test'

A key-value pair is printed as key followed by colon (:), a space ( ) and the value. Example: 'a': 67

A dictionary starts and ends with curly brackets ({ and }) and then has a comma (,) separated list of key-value pairs. Example: { 'name': 'Joe', 'age': 31 }

An array starts and ends with square brackets ([ and ]) and then has a comma (,) separated list of values. Example: [ 'hello', 56, 'world' ]

The binary representation of the JSON object contains a one byte identifier that describes the type of the data to follow and is then immediately followed by the data.

The identifiers and their types are as follows:

Identifier Type Description 0x01 Number 4 bytes signed integer in big endian byte order. 0x02 String N ASCII characters terminated by 0x00. 0x05 List Amount of items as a number followed by N values 0x06 Dictionary Amount of items as a number followed by N key-value pairs 

The program's task is to parse a binary file and prints it as human readable text. It should read the data from standard input and writes it the result to standard output.

Look at the files 'input_x' and their respective 'result_x' for examples of input and output. More background can be found on e.g. www.json.org

Input_4 binary

link to input_4 on gist

My solution

public class Main { private static String INPUT_FILENAME = "input_4"; private static String OUTPUT_FILENAME = "result_4"; private static String RESOURCE_INPUT_PATH = "src/main/resources/input/"; private static String RESOURCE_OUTPUT_PATH = "src/main/resources/output/"; public static void main(String[] args) { File resourcesDirectory = new File(String.format("%s%s", RESOURCE_INPUT_PATH, INPUT_FILENAME)); File file = new File(resourcesDirectory.getAbsolutePath()); try { byte[] byteArray = Files.readAllBytes(file.toPath()); RecursiveParser recursiveParser = new RecursiveParser(); try { String result = recursiveParser.parse(byteArray, 0, false).toString(); String prettyPrinted = prettyPrint(result); BufferedWriter writer = new BufferedWriter( new FileWriter( new File( String.format("%s%s%s", RESOURCE_OUTPUT_PATH, OUTPUT_FILENAME, ".json") ) ) ); writer.write(prettyPrinted); writer.close(); } catch (JSONException e) { e.printStackTrace(); } } catch (IOException e) { e.printStackTrace(); } } private static String prettyPrint(String data) throws JSONException { Object json = new JSONTokener(data).nextValue(); if (json instanceof JSONObject){ return (new JSONObject(data)).toString(4); } else if (json instanceof JSONArray){ return (new JSONArray(data)).toString(4); } else { return data; // nothing to pretty print } } } 
class RecursiveParser { private static int TERMINATE = 0x00; private static int NUMBER = 0x01; // Number 4 bytes signed integer in big endian byte order. private static int STRING = 0x02; // String N ASCII characters terminated by 0x00. private static int LIST = 0x05; // List Amount of items as a number followed by N values private static int DICTIONARY = 0x06; // Dictionary Amount of items as a number followed by N key-value pairs Object parse(byte[] byteArray, int index, boolean hasSub) throws JSONException { for(; index < byteArray.length; index++){ if(byteArray[index] == NUMBER){ return getNumber(byteArray, index); } if(byteArray[index] == STRING){ return getString(byteArray, index); } if(byteArray[index] == LIST){ return getList(byteArray, index, hasSub); } if(byteArray[index] == DICTIONARY){ return getDictionary(byteArray, index, hasSub); } } return null; // should never get here } private Object getDictionary(byte[] byteArray, int index, boolean hasSub) throws JSONException { index++; // move to size after type because dictionary size int dictionarySize = (int)parse(byteArray, index, hasSub); index += ByteBuffer.allocate(4).putInt(dictionarySize).array().length; JSONWriter jsonWriter = new JSONStringer() .object(); for(int i = 0; i < dictionarySize; i++){ index++; Object key = parse(byteArray, index, hasSub); int keyLength = 0; if(key instanceof Integer){ jsonWriter.key(String.valueOf(key)); keyLength += ByteBuffer.allocate(4).putInt((Integer) key).array().length; } else if(key instanceof String) { jsonWriter.key(String.valueOf(key)); keyLength += ((String) key).getBytes().length + 1; } index += keyLength + 1; // check if sub-array or sub-dictionary hasSub = hasSub || (byteArray[index] == DICTIONARY || byteArray[index] == LIST); Object value = parse(byteArray, index, hasSub); int valueLength = 0; if (value instanceof Integer) { jsonWriter.value(value); valueLength += ByteBuffer.allocate(4).putInt((Integer) value).array().length; } else if (value instanceof String) { jsonWriter.value(value); valueLength += String.valueOf(value).getBytes().length + 1; } else if (value instanceof AbstractMap.SimpleEntry) { valueLength = (int) ((AbstractMap.SimpleEntry) value).getKey() - index; jsonWriter.value(((AbstractMap.SimpleEntry) value).getValue()); } index += valueLength; } jsonWriter .endObject(); return hasSub && index != (byteArray.length - 1) ? new AbstractMap.SimpleEntry<>(index, new JSONObject(jsonWriter.toString())) : new JSONObject(jsonWriter.toString()); } private Object getList(byte[] byteArray, int index, boolean hasSub) throws JSONException { index++; // move to size after type because list size int listSize = (int)parse(byteArray, index, hasSub); index += ByteBuffer.allocate(4).putInt(listSize).array().length; JSONWriter jsonWriter = new JSONStringer().array(); for(int i = 0; i < listSize; i++){ index++; // check if sub-array or sub-dictionary hasSub = hasSub || byteArray[index] == DICTIONARY || byteArray[index] == LIST; Object value = parse(byteArray, index, hasSub); int valueLength = 0; if (value instanceof Integer) { jsonWriter.value(value); valueLength += ByteBuffer.allocate(4).putInt((Integer) value).array().length; } else if (value instanceof String) { jsonWriter.value(value); valueLength += String.valueOf(value).getBytes().length + 1; } else if (value instanceof AbstractMap.SimpleEntry) { valueLength = (int) ((AbstractMap.SimpleEntry) value).getKey() - index; jsonWriter.value(((AbstractMap.SimpleEntry) value).getValue()); } index += valueLength; } jsonWriter.endArray(); return hasSub && index != (byteArray.length - 1) ? new AbstractMap.SimpleEntry<>(index, new JSONArray(jsonWriter.toString())) : new JSONArray(jsonWriter.toString()); } private String getString(byte[] byteArray, int index) { int start = index + 1; // move to next value after type StringBuilder value = new StringBuilder(); for(int i = start; i < byteArray.length; i++){ if(byteArray[i] == TERMINATE){ break; } value.append((char)byteArray[i]); } return value.toString(); } private int getNumber(byte[] byteArray, int index) { int start = index + 1; // move to next value after type int offset = start + 4; byte[] numberByteArray = Arrays.copyOfRange(byteArray, start, offset); return new BigInteger(numberByteArray).intValue(); } } 

Result_4 output

{ "5": 25, "deep": { "1": "integer as key", "2": {"4": 19088743}, "mix": "it is possible to mix integers and strings" }, "first": 16777216, "second": "value for second" } 
\$\endgroup\$
2
  • \$\begingroup\$Would you mind adding additional examples? I don't quite follow the transformation from Input_4 to Result_4. Where is, for example, "5": 25 coming from?\$\endgroup\$
    – ggorlen
    CommentedMar 13, 2019 at 23:14
  • \$\begingroup\$It seems codereview strips the values I pasted from the binary and only keept the text ones. I'll try update the Input_4 binary or provide a link instead.\$\endgroup\$
    – Rovdjuret
    CommentedMar 14, 2019 at 7:41

1 Answer 1

7
\$\begingroup\$

There are two points that showed up immediately when looking at your implementation:

  • You missed one of the goals of the task: It should read the data from standard input and writes it the result to standard output.
  • You are reading in the complete binary data into memory and process it from there.

The latter is a no go, if we're talking about large structures of JSON-data, e.g. containing Base64-encoded binary data of a DVD-image (don't ask ;-) It might be unlikely in a real world example but it might let them come to the conclusion that you're not familiar with stream based processing and had another one how showed that ability which might have led to their decision.

Some remarks about the actual code:

Your reading-number-implementation:

private int getNumber(byte[] byteArray, int index) { int start = index + 1; // move to next value after type int offset = start + 4; byte[] numberByteArray = Arrays.copyOfRange(byteArray, start, offset); return new BigInteger(numberByteArray).intValue(); } 

The specification said that a number is a signed integer in big endian order which is exactly how a Java int is defined, so instead of creating temporary arrays and a BigInteger you could simply have used an int and used bit-shifting:

private int getNumber(byte[] byteArray, int index) { int ret = byteArray[index + 4] & 0xff; ret |= (byteArray[index + 3] & 0xff) < 8; ret |= (byteArray[index + 2] & 0xff) < 16; ret |= (byteArray[index + 1] & 0xff) < 24; return ret; } 

If you had implemented a stream based processing and used a DataInputStream the implementation would have been

private int getNumber(DataInputStream source) { return source.readInt(); } 

Your reading-text-implementation:

private String getString(byte[] byteArray, int index) { int start = index + 1; // move to next value after type StringBuilder value = new StringBuilder(); for(int i = start; i < byteArray.length; i++){ if(byteArray[i] == TERMINATE){ break; } value.append((char)byteArray[i]); } return value.toString(); } 

Not much that can be changed here but for good measure, I'd change (char)byteArray[i] to (char) (byteArray[i] & 0xff). For ASCII-characters that's irrelevant but still ;-)

In getDictionary:

private Object getDictionary(byte[] byteArray, int index, boolean hasSub) throws JSONException { [...] index += ByteBuffer.allocate(4).putInt(dictionarySize).array().length; 

That's a very elaborate form of

index += 4; 

because that's the number of bytes a signed integer is defined in the given specification.

keyLength += ((String) key).getBytes().length + 1; 

getBytes() uses the system's file-encoding. Because this particular example uses ASCII you won't see any effect on 99% of all systems but it would break on a system that runs e.g. with UTF-16 as file-encoding. You can see that youself by starting your Java test app with the system property -Dfile.encoding=UTF16.

This is a common Beginner's Error which might have raised a flag with them as well.

Always, ALWAYS use getBytes(encoding) unless you really want to use the system's file encoding for some reason.

hasSub = hasSub || (byteArray[index] == DICTIONARY || byteArray[index] == LIST); [...] } else if (value instanceof AbstractMap.SimpleEntry) { valueLength = (int) ((AbstractMap.SimpleEntry) value).getKey() - index; jsonWriter.value(((AbstractMap.SimpleEntry) value).getValue()); } 

This code block is duplicated in getList, you should put that into its own method and call it from the two methods. Same is true for the logic with the return-statement. This should be put into its own method so you only need to fix it once in case you find a bug in it.

General stuff:

You had to cope with the fact that your get-methods had to return a value and should change the value of the index. That's not possible, so you decided to change the index value in the calling method in dependence of the type of the returned parsed value. This is "not optimal" to say the least.

\$\endgroup\$
4
  • \$\begingroup\$Thank you very much for your review, I see now that this wasn't one of my greatest moments such as missing a obvious requirement. I learned a lot from your review and will bring it into my future assignments. Really appreciate it, cheers!\$\endgroup\$
    – Rovdjuret
    CommentedMar 13, 2019 at 17:28
  • \$\begingroup\$Also, your solution is just a standalone program. Implementing it as a class that can be directly imported into a project and having a separate unit tests probably would have helped. As to code style, it never hurts to mark your fields final and always throw an exception from code that should never be reached.\$\endgroup\$CommentedMar 14, 2019 at 8:14
  • \$\begingroup\$Probably should have started a separate answer, but eh, here we go. Using FilterOutputStream and InputStream.transferto(...) would have shown the interviewer your knowledge about the standard APIs.\$\endgroup\$CommentedMar 14, 2019 at 8:23
  • \$\begingroup\$@TorbenPutkonen Can you make it a proposed answer, please? :)\$\endgroup\$
    – Rovdjuret
    CommentedMar 14, 2019 at 9:01

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.