Recursive parser from Binary to JSON output












8












$begingroup$


Background



I got this interview test and got declined due to not meeting their expectations, but never got a reason on what was bad and how they solved it. I want to improve and learn something out of it.



Question



How can this be better written and what is badly written?



Assignment description



To optimize the amount of data sent between the web browser to the web server
a "clever" Javascript developer came up with the idea to translate JSON objects
into binary format in the application and send them to the server. Faced with
the fact that the Javascript is released in its final version to the customer
it is now your task to develop the parser on the back end system.



A JSON object is a hierarchy of key-value pairs where a value in its turn can
contain new key-value pairs. It consists of four basic types: numbers, strings,
arrays and dictionaries. An array is a list of values and a dictionay is a list
of key-value pairs. The key can only be of the type number or string while a
value can be of any type.



A JSON object always starts with a value.



An example JSON object can look like:



{
'firstName': 'John',
'lastName': 'Smith',
'age': 25,
'address': {
'streetAddress': '21 2nd Street',
'city': 'New York',
'state': 'NY',
'postalCode': '10021'
},
'phoneNumber': [
{ 'type': 'home', 'number': '212 555-1234' },
{ 'type': 'fax', 'number': '646 555-4567' }
]
}


A number is printed in decimal without any decoration.
Example: 25



A string is printed in ASCII with single quotes in the start and end of the
string.
Example: 'test'



A key-value pair is printed as key followed by colon (:), a space ( ) and the
value.
Example: 'a': 67



A dictionary starts and ends with curly brackets ({ and }) and then has a
comma (,) separated list of key-value pairs.
Example: { 'name': 'Joe', 'age': 31 }



An array starts and ends with square brackets ([ and ]) and then has a
comma (,) separated list of values.
Example: [ 'hello', 56, 'world' ]



The binary representation of the JSON object contains a one byte identifier
that describes the type of the data to follow and is then immediately followed
by the data.



The identifiers and their types are as follows:



Identifier  Type             Description
0x01 Number 4 bytes signed integer in big endian byte order.
0x02 String N ASCII characters terminated by 0x00.
0x05 List Amount of items as a number followed by N values
0x06 Dictionary Amount of items as a number followed by N
key-value pairs


The program's task is to parse a binary file and prints it as human readable
text. It should read the data from standard input and writes it the result to
standard output.



Look at the files 'input_x' and their respective 'result_x' for examples of
input and output. More background can be found on e.g. www.json.org



Input_4 binary



link to input_4 on gist



My solution



public class Main {

private static String INPUT_FILENAME = "input_4";
private static String OUTPUT_FILENAME = "result_4";

private static String RESOURCE_INPUT_PATH = "src/main/resources/input/";
private static String RESOURCE_OUTPUT_PATH = "src/main/resources/output/";

public static void main(String args) {
File resourcesDirectory = new File(String.format("%s%s", RESOURCE_INPUT_PATH, INPUT_FILENAME));
File file = new File(resourcesDirectory.getAbsolutePath());

try {
byte byteArray = Files.readAllBytes(file.toPath());

RecursiveParser recursiveParser = new RecursiveParser();

try {
String result = recursiveParser.parse(byteArray, 0, false).toString();

String prettyPrinted = prettyPrint(result);
BufferedWriter writer = new BufferedWriter(
new FileWriter(
new File(
String.format("%s%s%s", RESOURCE_OUTPUT_PATH, OUTPUT_FILENAME, ".json")
)
)
);
writer.write(prettyPrinted);

writer.close();
} catch (JSONException e) {
e.printStackTrace();
}
} catch (IOException e) {
e.printStackTrace();
}
}

private static String prettyPrint(String data) throws JSONException {
Object json = new JSONTokener(data).nextValue();
if (json instanceof JSONObject){
return (new JSONObject(data)).toString(4);
}
else if (json instanceof JSONArray){
return (new JSONArray(data)).toString(4);
} else {
return data; // nothing to pretty print
}
}
}


class RecursiveParser {
private static int TERMINATE = 0x00;

private static int NUMBER = 0x01; // Number 4 bytes signed integer in big endian byte order.
private static int STRING = 0x02; // String N ASCII characters terminated by 0x00.
private static int LIST = 0x05; // List Amount of items as a number followed by N values
private static int DICTIONARY = 0x06; // Dictionary Amount of items as a number followed by N key-value pairs

Object parse(byte byteArray, int index, boolean hasSub) throws JSONException {
for(; index < byteArray.length; index++){
if(byteArray[index] == NUMBER){
return getNumber(byteArray, index);
}
if(byteArray[index] == STRING){
return getString(byteArray, index);
}
if(byteArray[index] == LIST){
return getList(byteArray, index, hasSub);
}
if(byteArray[index] == DICTIONARY){
return getDictionary(byteArray, index, hasSub);
}
}
return null; // should never get here
}

private Object getDictionary(byte byteArray, int index, boolean hasSub) throws JSONException {
index++; // move to size after type because dictionary size

int dictionarySize = (int)parse(byteArray, index, hasSub);
index += ByteBuffer.allocate(4).putInt(dictionarySize).array().length;

JSONWriter jsonWriter = new JSONStringer()
.object();

for(int i = 0; i < dictionarySize; i++){
index++;

Object key = parse(byteArray, index, hasSub);

int keyLength = 0;
if(key instanceof Integer){
jsonWriter.key(String.valueOf(key));
keyLength += ByteBuffer.allocate(4).putInt((Integer) key).array().length;
} else if(key instanceof String) {
jsonWriter.key(String.valueOf(key));
keyLength += ((String) key).getBytes().length + 1;
}

index += keyLength + 1;
// check if sub-array or sub-dictionary
hasSub = hasSub || (byteArray[index] == DICTIONARY || byteArray[index] == LIST);

Object value = parse(byteArray, index, hasSub);

int valueLength = 0;
if (value instanceof Integer) {
jsonWriter.value(value);
valueLength += ByteBuffer.allocate(4).putInt((Integer) value).array().length;
} else if (value instanceof String) {
jsonWriter.value(value);
valueLength += String.valueOf(value).getBytes().length + 1;
} else if (value instanceof AbstractMap.SimpleEntry) {
valueLength = (int) ((AbstractMap.SimpleEntry) value).getKey() - index;
jsonWriter.value(((AbstractMap.SimpleEntry) value).getValue());
}

index += valueLength;
}

jsonWriter
.endObject();

return hasSub && index != (byteArray.length - 1) ? new AbstractMap.SimpleEntry<>(index, new JSONObject(jsonWriter.toString())) : new JSONObject(jsonWriter.toString());
}

private Object getList(byte byteArray, int index, boolean hasSub) throws JSONException {
index++; // move to size after type because list size

int listSize = (int)parse(byteArray, index, hasSub);
index += ByteBuffer.allocate(4).putInt(listSize).array().length;

JSONWriter jsonWriter = new JSONStringer().array();
for(int i = 0; i < listSize; i++){
index++;

// check if sub-array or sub-dictionary
hasSub = hasSub || byteArray[index] == DICTIONARY || byteArray[index] == LIST;

Object value = parse(byteArray, index, hasSub);

int valueLength = 0;
if (value instanceof Integer) {
jsonWriter.value(value);
valueLength += ByteBuffer.allocate(4).putInt((Integer) value).array().length;
} else if (value instanceof String) {
jsonWriter.value(value);
valueLength += String.valueOf(value).getBytes().length + 1;
} else if (value instanceof AbstractMap.SimpleEntry) {
valueLength = (int) ((AbstractMap.SimpleEntry) value).getKey() - index;
jsonWriter.value(((AbstractMap.SimpleEntry) value).getValue());
}

index += valueLength;
}
jsonWriter.endArray();

return hasSub && index != (byteArray.length - 1) ? new AbstractMap.SimpleEntry<>(index, new JSONArray(jsonWriter.toString())) : new JSONArray(jsonWriter.toString());
}

private String getString(byte byteArray, int index) {
int start = index + 1; // move to next value after type
StringBuilder value = new StringBuilder();
for(int i = start; i < byteArray.length; i++){
if(byteArray[i] == TERMINATE){
break;
}
value.append((char)byteArray[i]);
}
return value.toString();
}

private int getNumber(byte byteArray, int index) {
int start = index + 1; // move to next value after type
int offset = start + 4;

byte numberByteArray = Arrays.copyOfRange(byteArray, start, offset);

return new BigInteger(numberByteArray).intValue();
}
}


Result_4 output



{
"5": 25,
"deep": {
"1": "integer as key",
"2": {"4": 19088743},
"mix": "it is possible to mix integers and strings"
},
"first": 16777216,
"second": "value for second"
}









share|improve this question











$endgroup$












  • $begingroup$
    Would you mind adding additional examples? I don't quite follow the transformation from Input_4 to Result_4. Where is, for example, "5": 25 coming from?
    $endgroup$
    – ggorlen
    Mar 13 at 23:14












  • $begingroup$
    It seems codereview strips the values I pasted from the binary and only keept the text ones. I'll try update the Input_4 binary or provide a link instead.
    $endgroup$
    – Rovdjuret
    Mar 14 at 7:41
















8












$begingroup$


Background



I got this interview test and got declined due to not meeting their expectations, but never got a reason on what was bad and how they solved it. I want to improve and learn something out of it.



Question



How can this be better written and what is badly written?



Assignment description



To optimize the amount of data sent between the web browser to the web server
a "clever" Javascript developer came up with the idea to translate JSON objects
into binary format in the application and send them to the server. Faced with
the fact that the Javascript is released in its final version to the customer
it is now your task to develop the parser on the back end system.



A JSON object is a hierarchy of key-value pairs where a value in its turn can
contain new key-value pairs. It consists of four basic types: numbers, strings,
arrays and dictionaries. An array is a list of values and a dictionay is a list
of key-value pairs. The key can only be of the type number or string while a
value can be of any type.



A JSON object always starts with a value.



An example JSON object can look like:



{
'firstName': 'John',
'lastName': 'Smith',
'age': 25,
'address': {
'streetAddress': '21 2nd Street',
'city': 'New York',
'state': 'NY',
'postalCode': '10021'
},
'phoneNumber': [
{ 'type': 'home', 'number': '212 555-1234' },
{ 'type': 'fax', 'number': '646 555-4567' }
]
}


A number is printed in decimal without any decoration.
Example: 25



A string is printed in ASCII with single quotes in the start and end of the
string.
Example: 'test'



A key-value pair is printed as key followed by colon (:), a space ( ) and the
value.
Example: 'a': 67



A dictionary starts and ends with curly brackets ({ and }) and then has a
comma (,) separated list of key-value pairs.
Example: { 'name': 'Joe', 'age': 31 }



An array starts and ends with square brackets ([ and ]) and then has a
comma (,) separated list of values.
Example: [ 'hello', 56, 'world' ]



The binary representation of the JSON object contains a one byte identifier
that describes the type of the data to follow and is then immediately followed
by the data.



The identifiers and their types are as follows:



Identifier  Type             Description
0x01 Number 4 bytes signed integer in big endian byte order.
0x02 String N ASCII characters terminated by 0x00.
0x05 List Amount of items as a number followed by N values
0x06 Dictionary Amount of items as a number followed by N
key-value pairs


The program's task is to parse a binary file and prints it as human readable
text. It should read the data from standard input and writes it the result to
standard output.



Look at the files 'input_x' and their respective 'result_x' for examples of
input and output. More background can be found on e.g. www.json.org



Input_4 binary



link to input_4 on gist



My solution



public class Main {

private static String INPUT_FILENAME = "input_4";
private static String OUTPUT_FILENAME = "result_4";

private static String RESOURCE_INPUT_PATH = "src/main/resources/input/";
private static String RESOURCE_OUTPUT_PATH = "src/main/resources/output/";

public static void main(String args) {
File resourcesDirectory = new File(String.format("%s%s", RESOURCE_INPUT_PATH, INPUT_FILENAME));
File file = new File(resourcesDirectory.getAbsolutePath());

try {
byte byteArray = Files.readAllBytes(file.toPath());

RecursiveParser recursiveParser = new RecursiveParser();

try {
String result = recursiveParser.parse(byteArray, 0, false).toString();

String prettyPrinted = prettyPrint(result);
BufferedWriter writer = new BufferedWriter(
new FileWriter(
new File(
String.format("%s%s%s", RESOURCE_OUTPUT_PATH, OUTPUT_FILENAME, ".json")
)
)
);
writer.write(prettyPrinted);

writer.close();
} catch (JSONException e) {
e.printStackTrace();
}
} catch (IOException e) {
e.printStackTrace();
}
}

private static String prettyPrint(String data) throws JSONException {
Object json = new JSONTokener(data).nextValue();
if (json instanceof JSONObject){
return (new JSONObject(data)).toString(4);
}
else if (json instanceof JSONArray){
return (new JSONArray(data)).toString(4);
} else {
return data; // nothing to pretty print
}
}
}


class RecursiveParser {
private static int TERMINATE = 0x00;

private static int NUMBER = 0x01; // Number 4 bytes signed integer in big endian byte order.
private static int STRING = 0x02; // String N ASCII characters terminated by 0x00.
private static int LIST = 0x05; // List Amount of items as a number followed by N values
private static int DICTIONARY = 0x06; // Dictionary Amount of items as a number followed by N key-value pairs

Object parse(byte byteArray, int index, boolean hasSub) throws JSONException {
for(; index < byteArray.length; index++){
if(byteArray[index] == NUMBER){
return getNumber(byteArray, index);
}
if(byteArray[index] == STRING){
return getString(byteArray, index);
}
if(byteArray[index] == LIST){
return getList(byteArray, index, hasSub);
}
if(byteArray[index] == DICTIONARY){
return getDictionary(byteArray, index, hasSub);
}
}
return null; // should never get here
}

private Object getDictionary(byte byteArray, int index, boolean hasSub) throws JSONException {
index++; // move to size after type because dictionary size

int dictionarySize = (int)parse(byteArray, index, hasSub);
index += ByteBuffer.allocate(4).putInt(dictionarySize).array().length;

JSONWriter jsonWriter = new JSONStringer()
.object();

for(int i = 0; i < dictionarySize; i++){
index++;

Object key = parse(byteArray, index, hasSub);

int keyLength = 0;
if(key instanceof Integer){
jsonWriter.key(String.valueOf(key));
keyLength += ByteBuffer.allocate(4).putInt((Integer) key).array().length;
} else if(key instanceof String) {
jsonWriter.key(String.valueOf(key));
keyLength += ((String) key).getBytes().length + 1;
}

index += keyLength + 1;
// check if sub-array or sub-dictionary
hasSub = hasSub || (byteArray[index] == DICTIONARY || byteArray[index] == LIST);

Object value = parse(byteArray, index, hasSub);

int valueLength = 0;
if (value instanceof Integer) {
jsonWriter.value(value);
valueLength += ByteBuffer.allocate(4).putInt((Integer) value).array().length;
} else if (value instanceof String) {
jsonWriter.value(value);
valueLength += String.valueOf(value).getBytes().length + 1;
} else if (value instanceof AbstractMap.SimpleEntry) {
valueLength = (int) ((AbstractMap.SimpleEntry) value).getKey() - index;
jsonWriter.value(((AbstractMap.SimpleEntry) value).getValue());
}

index += valueLength;
}

jsonWriter
.endObject();

return hasSub && index != (byteArray.length - 1) ? new AbstractMap.SimpleEntry<>(index, new JSONObject(jsonWriter.toString())) : new JSONObject(jsonWriter.toString());
}

private Object getList(byte byteArray, int index, boolean hasSub) throws JSONException {
index++; // move to size after type because list size

int listSize = (int)parse(byteArray, index, hasSub);
index += ByteBuffer.allocate(4).putInt(listSize).array().length;

JSONWriter jsonWriter = new JSONStringer().array();
for(int i = 0; i < listSize; i++){
index++;

// check if sub-array or sub-dictionary
hasSub = hasSub || byteArray[index] == DICTIONARY || byteArray[index] == LIST;

Object value = parse(byteArray, index, hasSub);

int valueLength = 0;
if (value instanceof Integer) {
jsonWriter.value(value);
valueLength += ByteBuffer.allocate(4).putInt((Integer) value).array().length;
} else if (value instanceof String) {
jsonWriter.value(value);
valueLength += String.valueOf(value).getBytes().length + 1;
} else if (value instanceof AbstractMap.SimpleEntry) {
valueLength = (int) ((AbstractMap.SimpleEntry) value).getKey() - index;
jsonWriter.value(((AbstractMap.SimpleEntry) value).getValue());
}

index += valueLength;
}
jsonWriter.endArray();

return hasSub && index != (byteArray.length - 1) ? new AbstractMap.SimpleEntry<>(index, new JSONArray(jsonWriter.toString())) : new JSONArray(jsonWriter.toString());
}

private String getString(byte byteArray, int index) {
int start = index + 1; // move to next value after type
StringBuilder value = new StringBuilder();
for(int i = start; i < byteArray.length; i++){
if(byteArray[i] == TERMINATE){
break;
}
value.append((char)byteArray[i]);
}
return value.toString();
}

private int getNumber(byte byteArray, int index) {
int start = index + 1; // move to next value after type
int offset = start + 4;

byte numberByteArray = Arrays.copyOfRange(byteArray, start, offset);

return new BigInteger(numberByteArray).intValue();
}
}


Result_4 output



{
"5": 25,
"deep": {
"1": "integer as key",
"2": {"4": 19088743},
"mix": "it is possible to mix integers and strings"
},
"first": 16777216,
"second": "value for second"
}









share|improve this question











$endgroup$












  • $begingroup$
    Would you mind adding additional examples? I don't quite follow the transformation from Input_4 to Result_4. Where is, for example, "5": 25 coming from?
    $endgroup$
    – ggorlen
    Mar 13 at 23:14












  • $begingroup$
    It seems codereview strips the values I pasted from the binary and only keept the text ones. I'll try update the Input_4 binary or provide a link instead.
    $endgroup$
    – Rovdjuret
    Mar 14 at 7:41














8












8








8


1



$begingroup$


Background



I got this interview test and got declined due to not meeting their expectations, but never got a reason on what was bad and how they solved it. I want to improve and learn something out of it.



Question



How can this be better written and what is badly written?



Assignment description



To optimize the amount of data sent between the web browser to the web server
a "clever" Javascript developer came up with the idea to translate JSON objects
into binary format in the application and send them to the server. Faced with
the fact that the Javascript is released in its final version to the customer
it is now your task to develop the parser on the back end system.



A JSON object is a hierarchy of key-value pairs where a value in its turn can
contain new key-value pairs. It consists of four basic types: numbers, strings,
arrays and dictionaries. An array is a list of values and a dictionay is a list
of key-value pairs. The key can only be of the type number or string while a
value can be of any type.



A JSON object always starts with a value.



An example JSON object can look like:



{
'firstName': 'John',
'lastName': 'Smith',
'age': 25,
'address': {
'streetAddress': '21 2nd Street',
'city': 'New York',
'state': 'NY',
'postalCode': '10021'
},
'phoneNumber': [
{ 'type': 'home', 'number': '212 555-1234' },
{ 'type': 'fax', 'number': '646 555-4567' }
]
}


A number is printed in decimal without any decoration.
Example: 25



A string is printed in ASCII with single quotes in the start and end of the
string.
Example: 'test'



A key-value pair is printed as key followed by colon (:), a space ( ) and the
value.
Example: 'a': 67



A dictionary starts and ends with curly brackets ({ and }) and then has a
comma (,) separated list of key-value pairs.
Example: { 'name': 'Joe', 'age': 31 }



An array starts and ends with square brackets ([ and ]) and then has a
comma (,) separated list of values.
Example: [ 'hello', 56, 'world' ]



The binary representation of the JSON object contains a one byte identifier
that describes the type of the data to follow and is then immediately followed
by the data.



The identifiers and their types are as follows:



Identifier  Type             Description
0x01 Number 4 bytes signed integer in big endian byte order.
0x02 String N ASCII characters terminated by 0x00.
0x05 List Amount of items as a number followed by N values
0x06 Dictionary Amount of items as a number followed by N
key-value pairs


The program's task is to parse a binary file and prints it as human readable
text. It should read the data from standard input and writes it the result to
standard output.



Look at the files 'input_x' and their respective 'result_x' for examples of
input and output. More background can be found on e.g. www.json.org



Input_4 binary



link to input_4 on gist



My solution



public class Main {

private static String INPUT_FILENAME = "input_4";
private static String OUTPUT_FILENAME = "result_4";

private static String RESOURCE_INPUT_PATH = "src/main/resources/input/";
private static String RESOURCE_OUTPUT_PATH = "src/main/resources/output/";

public static void main(String args) {
File resourcesDirectory = new File(String.format("%s%s", RESOURCE_INPUT_PATH, INPUT_FILENAME));
File file = new File(resourcesDirectory.getAbsolutePath());

try {
byte byteArray = Files.readAllBytes(file.toPath());

RecursiveParser recursiveParser = new RecursiveParser();

try {
String result = recursiveParser.parse(byteArray, 0, false).toString();

String prettyPrinted = prettyPrint(result);
BufferedWriter writer = new BufferedWriter(
new FileWriter(
new File(
String.format("%s%s%s", RESOURCE_OUTPUT_PATH, OUTPUT_FILENAME, ".json")
)
)
);
writer.write(prettyPrinted);

writer.close();
} catch (JSONException e) {
e.printStackTrace();
}
} catch (IOException e) {
e.printStackTrace();
}
}

private static String prettyPrint(String data) throws JSONException {
Object json = new JSONTokener(data).nextValue();
if (json instanceof JSONObject){
return (new JSONObject(data)).toString(4);
}
else if (json instanceof JSONArray){
return (new JSONArray(data)).toString(4);
} else {
return data; // nothing to pretty print
}
}
}


class RecursiveParser {
private static int TERMINATE = 0x00;

private static int NUMBER = 0x01; // Number 4 bytes signed integer in big endian byte order.
private static int STRING = 0x02; // String N ASCII characters terminated by 0x00.
private static int LIST = 0x05; // List Amount of items as a number followed by N values
private static int DICTIONARY = 0x06; // Dictionary Amount of items as a number followed by N key-value pairs

Object parse(byte byteArray, int index, boolean hasSub) throws JSONException {
for(; index < byteArray.length; index++){
if(byteArray[index] == NUMBER){
return getNumber(byteArray, index);
}
if(byteArray[index] == STRING){
return getString(byteArray, index);
}
if(byteArray[index] == LIST){
return getList(byteArray, index, hasSub);
}
if(byteArray[index] == DICTIONARY){
return getDictionary(byteArray, index, hasSub);
}
}
return null; // should never get here
}

private Object getDictionary(byte byteArray, int index, boolean hasSub) throws JSONException {
index++; // move to size after type because dictionary size

int dictionarySize = (int)parse(byteArray, index, hasSub);
index += ByteBuffer.allocate(4).putInt(dictionarySize).array().length;

JSONWriter jsonWriter = new JSONStringer()
.object();

for(int i = 0; i < dictionarySize; i++){
index++;

Object key = parse(byteArray, index, hasSub);

int keyLength = 0;
if(key instanceof Integer){
jsonWriter.key(String.valueOf(key));
keyLength += ByteBuffer.allocate(4).putInt((Integer) key).array().length;
} else if(key instanceof String) {
jsonWriter.key(String.valueOf(key));
keyLength += ((String) key).getBytes().length + 1;
}

index += keyLength + 1;
// check if sub-array or sub-dictionary
hasSub = hasSub || (byteArray[index] == DICTIONARY || byteArray[index] == LIST);

Object value = parse(byteArray, index, hasSub);

int valueLength = 0;
if (value instanceof Integer) {
jsonWriter.value(value);
valueLength += ByteBuffer.allocate(4).putInt((Integer) value).array().length;
} else if (value instanceof String) {
jsonWriter.value(value);
valueLength += String.valueOf(value).getBytes().length + 1;
} else if (value instanceof AbstractMap.SimpleEntry) {
valueLength = (int) ((AbstractMap.SimpleEntry) value).getKey() - index;
jsonWriter.value(((AbstractMap.SimpleEntry) value).getValue());
}

index += valueLength;
}

jsonWriter
.endObject();

return hasSub && index != (byteArray.length - 1) ? new AbstractMap.SimpleEntry<>(index, new JSONObject(jsonWriter.toString())) : new JSONObject(jsonWriter.toString());
}

private Object getList(byte byteArray, int index, boolean hasSub) throws JSONException {
index++; // move to size after type because list size

int listSize = (int)parse(byteArray, index, hasSub);
index += ByteBuffer.allocate(4).putInt(listSize).array().length;

JSONWriter jsonWriter = new JSONStringer().array();
for(int i = 0; i < listSize; i++){
index++;

// check if sub-array or sub-dictionary
hasSub = hasSub || byteArray[index] == DICTIONARY || byteArray[index] == LIST;

Object value = parse(byteArray, index, hasSub);

int valueLength = 0;
if (value instanceof Integer) {
jsonWriter.value(value);
valueLength += ByteBuffer.allocate(4).putInt((Integer) value).array().length;
} else if (value instanceof String) {
jsonWriter.value(value);
valueLength += String.valueOf(value).getBytes().length + 1;
} else if (value instanceof AbstractMap.SimpleEntry) {
valueLength = (int) ((AbstractMap.SimpleEntry) value).getKey() - index;
jsonWriter.value(((AbstractMap.SimpleEntry) value).getValue());
}

index += valueLength;
}
jsonWriter.endArray();

return hasSub && index != (byteArray.length - 1) ? new AbstractMap.SimpleEntry<>(index, new JSONArray(jsonWriter.toString())) : new JSONArray(jsonWriter.toString());
}

private String getString(byte byteArray, int index) {
int start = index + 1; // move to next value after type
StringBuilder value = new StringBuilder();
for(int i = start; i < byteArray.length; i++){
if(byteArray[i] == TERMINATE){
break;
}
value.append((char)byteArray[i]);
}
return value.toString();
}

private int getNumber(byte byteArray, int index) {
int start = index + 1; // move to next value after type
int offset = start + 4;

byte numberByteArray = Arrays.copyOfRange(byteArray, start, offset);

return new BigInteger(numberByteArray).intValue();
}
}


Result_4 output



{
"5": 25,
"deep": {
"1": "integer as key",
"2": {"4": 19088743},
"mix": "it is possible to mix integers and strings"
},
"first": 16777216,
"second": "value for second"
}









share|improve this question











$endgroup$




Background



I got this interview test and got declined due to not meeting their expectations, but never got a reason on what was bad and how they solved it. I want to improve and learn something out of it.



Question



How can this be better written and what is badly written?



Assignment description



To optimize the amount of data sent between the web browser to the web server
a "clever" Javascript developer came up with the idea to translate JSON objects
into binary format in the application and send them to the server. Faced with
the fact that the Javascript is released in its final version to the customer
it is now your task to develop the parser on the back end system.



A JSON object is a hierarchy of key-value pairs where a value in its turn can
contain new key-value pairs. It consists of four basic types: numbers, strings,
arrays and dictionaries. An array is a list of values and a dictionay is a list
of key-value pairs. The key can only be of the type number or string while a
value can be of any type.



A JSON object always starts with a value.



An example JSON object can look like:



{
'firstName': 'John',
'lastName': 'Smith',
'age': 25,
'address': {
'streetAddress': '21 2nd Street',
'city': 'New York',
'state': 'NY',
'postalCode': '10021'
},
'phoneNumber': [
{ 'type': 'home', 'number': '212 555-1234' },
{ 'type': 'fax', 'number': '646 555-4567' }
]
}


A number is printed in decimal without any decoration.
Example: 25



A string is printed in ASCII with single quotes in the start and end of the
string.
Example: 'test'



A key-value pair is printed as key followed by colon (:), a space ( ) and the
value.
Example: 'a': 67



A dictionary starts and ends with curly brackets ({ and }) and then has a
comma (,) separated list of key-value pairs.
Example: { 'name': 'Joe', 'age': 31 }



An array starts and ends with square brackets ([ and ]) and then has a
comma (,) separated list of values.
Example: [ 'hello', 56, 'world' ]



The binary representation of the JSON object contains a one byte identifier
that describes the type of the data to follow and is then immediately followed
by the data.



The identifiers and their types are as follows:



Identifier  Type             Description
0x01 Number 4 bytes signed integer in big endian byte order.
0x02 String N ASCII characters terminated by 0x00.
0x05 List Amount of items as a number followed by N values
0x06 Dictionary Amount of items as a number followed by N
key-value pairs


The program's task is to parse a binary file and prints it as human readable
text. It should read the data from standard input and writes it the result to
standard output.



Look at the files 'input_x' and their respective 'result_x' for examples of
input and output. More background can be found on e.g. www.json.org



Input_4 binary



link to input_4 on gist



My solution



public class Main {

private static String INPUT_FILENAME = "input_4";
private static String OUTPUT_FILENAME = "result_4";

private static String RESOURCE_INPUT_PATH = "src/main/resources/input/";
private static String RESOURCE_OUTPUT_PATH = "src/main/resources/output/";

public static void main(String args) {
File resourcesDirectory = new File(String.format("%s%s", RESOURCE_INPUT_PATH, INPUT_FILENAME));
File file = new File(resourcesDirectory.getAbsolutePath());

try {
byte byteArray = Files.readAllBytes(file.toPath());

RecursiveParser recursiveParser = new RecursiveParser();

try {
String result = recursiveParser.parse(byteArray, 0, false).toString();

String prettyPrinted = prettyPrint(result);
BufferedWriter writer = new BufferedWriter(
new FileWriter(
new File(
String.format("%s%s%s", RESOURCE_OUTPUT_PATH, OUTPUT_FILENAME, ".json")
)
)
);
writer.write(prettyPrinted);

writer.close();
} catch (JSONException e) {
e.printStackTrace();
}
} catch (IOException e) {
e.printStackTrace();
}
}

private static String prettyPrint(String data) throws JSONException {
Object json = new JSONTokener(data).nextValue();
if (json instanceof JSONObject){
return (new JSONObject(data)).toString(4);
}
else if (json instanceof JSONArray){
return (new JSONArray(data)).toString(4);
} else {
return data; // nothing to pretty print
}
}
}


class RecursiveParser {
private static int TERMINATE = 0x00;

private static int NUMBER = 0x01; // Number 4 bytes signed integer in big endian byte order.
private static int STRING = 0x02; // String N ASCII characters terminated by 0x00.
private static int LIST = 0x05; // List Amount of items as a number followed by N values
private static int DICTIONARY = 0x06; // Dictionary Amount of items as a number followed by N key-value pairs

Object parse(byte byteArray, int index, boolean hasSub) throws JSONException {
for(; index < byteArray.length; index++){
if(byteArray[index] == NUMBER){
return getNumber(byteArray, index);
}
if(byteArray[index] == STRING){
return getString(byteArray, index);
}
if(byteArray[index] == LIST){
return getList(byteArray, index, hasSub);
}
if(byteArray[index] == DICTIONARY){
return getDictionary(byteArray, index, hasSub);
}
}
return null; // should never get here
}

private Object getDictionary(byte byteArray, int index, boolean hasSub) throws JSONException {
index++; // move to size after type because dictionary size

int dictionarySize = (int)parse(byteArray, index, hasSub);
index += ByteBuffer.allocate(4).putInt(dictionarySize).array().length;

JSONWriter jsonWriter = new JSONStringer()
.object();

for(int i = 0; i < dictionarySize; i++){
index++;

Object key = parse(byteArray, index, hasSub);

int keyLength = 0;
if(key instanceof Integer){
jsonWriter.key(String.valueOf(key));
keyLength += ByteBuffer.allocate(4).putInt((Integer) key).array().length;
} else if(key instanceof String) {
jsonWriter.key(String.valueOf(key));
keyLength += ((String) key).getBytes().length + 1;
}

index += keyLength + 1;
// check if sub-array or sub-dictionary
hasSub = hasSub || (byteArray[index] == DICTIONARY || byteArray[index] == LIST);

Object value = parse(byteArray, index, hasSub);

int valueLength = 0;
if (value instanceof Integer) {
jsonWriter.value(value);
valueLength += ByteBuffer.allocate(4).putInt((Integer) value).array().length;
} else if (value instanceof String) {
jsonWriter.value(value);
valueLength += String.valueOf(value).getBytes().length + 1;
} else if (value instanceof AbstractMap.SimpleEntry) {
valueLength = (int) ((AbstractMap.SimpleEntry) value).getKey() - index;
jsonWriter.value(((AbstractMap.SimpleEntry) value).getValue());
}

index += valueLength;
}

jsonWriter
.endObject();

return hasSub && index != (byteArray.length - 1) ? new AbstractMap.SimpleEntry<>(index, new JSONObject(jsonWriter.toString())) : new JSONObject(jsonWriter.toString());
}

private Object getList(byte byteArray, int index, boolean hasSub) throws JSONException {
index++; // move to size after type because list size

int listSize = (int)parse(byteArray, index, hasSub);
index += ByteBuffer.allocate(4).putInt(listSize).array().length;

JSONWriter jsonWriter = new JSONStringer().array();
for(int i = 0; i < listSize; i++){
index++;

// check if sub-array or sub-dictionary
hasSub = hasSub || byteArray[index] == DICTIONARY || byteArray[index] == LIST;

Object value = parse(byteArray, index, hasSub);

int valueLength = 0;
if (value instanceof Integer) {
jsonWriter.value(value);
valueLength += ByteBuffer.allocate(4).putInt((Integer) value).array().length;
} else if (value instanceof String) {
jsonWriter.value(value);
valueLength += String.valueOf(value).getBytes().length + 1;
} else if (value instanceof AbstractMap.SimpleEntry) {
valueLength = (int) ((AbstractMap.SimpleEntry) value).getKey() - index;
jsonWriter.value(((AbstractMap.SimpleEntry) value).getValue());
}

index += valueLength;
}
jsonWriter.endArray();

return hasSub && index != (byteArray.length - 1) ? new AbstractMap.SimpleEntry<>(index, new JSONArray(jsonWriter.toString())) : new JSONArray(jsonWriter.toString());
}

private String getString(byte byteArray, int index) {
int start = index + 1; // move to next value after type
StringBuilder value = new StringBuilder();
for(int i = start; i < byteArray.length; i++){
if(byteArray[i] == TERMINATE){
break;
}
value.append((char)byteArray[i]);
}
return value.toString();
}

private int getNumber(byte byteArray, int index) {
int start = index + 1; // move to next value after type
int offset = start + 4;

byte numberByteArray = Arrays.copyOfRange(byteArray, start, offset);

return new BigInteger(numberByteArray).intValue();
}
}


Result_4 output



{
"5": 25,
"deep": {
"1": "integer as key",
"2": {"4": 19088743},
"mix": "it is possible to mix integers and strings"
},
"first": 16777216,
"second": "value for second"
}






java recursion interview-questions json






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Mar 14 at 8:48







Rovdjuret

















asked Mar 13 at 16:27









RovdjuretRovdjuret

1435




1435












  • $begingroup$
    Would you mind adding additional examples? I don't quite follow the transformation from Input_4 to Result_4. Where is, for example, "5": 25 coming from?
    $endgroup$
    – ggorlen
    Mar 13 at 23:14












  • $begingroup$
    It seems codereview strips the values I pasted from the binary and only keept the text ones. I'll try update the Input_4 binary or provide a link instead.
    $endgroup$
    – Rovdjuret
    Mar 14 at 7:41


















  • $begingroup$
    Would you mind adding additional examples? I don't quite follow the transformation from Input_4 to Result_4. Where is, for example, "5": 25 coming from?
    $endgroup$
    – ggorlen
    Mar 13 at 23:14












  • $begingroup$
    It seems codereview strips the values I pasted from the binary and only keept the text ones. I'll try update the Input_4 binary or provide a link instead.
    $endgroup$
    – Rovdjuret
    Mar 14 at 7:41
















$begingroup$
Would you mind adding additional examples? I don't quite follow the transformation from Input_4 to Result_4. Where is, for example, "5": 25 coming from?
$endgroup$
– ggorlen
Mar 13 at 23:14






$begingroup$
Would you mind adding additional examples? I don't quite follow the transformation from Input_4 to Result_4. Where is, for example, "5": 25 coming from?
$endgroup$
– ggorlen
Mar 13 at 23:14














$begingroup$
It seems codereview strips the values I pasted from the binary and only keept the text ones. I'll try update the Input_4 binary or provide a link instead.
$endgroup$
– Rovdjuret
Mar 14 at 7:41




$begingroup$
It seems codereview strips the values I pasted from the binary and only keept the text ones. I'll try update the Input_4 binary or provide a link instead.
$endgroup$
– Rovdjuret
Mar 14 at 7:41










1 Answer
1






active

oldest

votes


















7












$begingroup$

There are two points that showed up immediately when looking at your implementation:




  • You missed one of the goals of the task: It should read the data from standard input and writes it the result to standard output.

  • You are reading in the complete binary data into memory and process it from there.


The latter is a no go, if we're talking about large structures of JSON-data, e.g. containing Base64-encoded binary data of a DVD-image (don't ask ;-) It might be unlikely in a real world example but it might let them come to the conclusion that you're not familiar with stream based processing and had another one how showed that ability which might have led to their decision.



Some remarks about the actual code:



Your reading-number-implementation:



private int getNumber(byte byteArray, int index) {
int start = index + 1; // move to next value after type
int offset = start + 4;

byte numberByteArray = Arrays.copyOfRange(byteArray, start, offset);

return new BigInteger(numberByteArray).intValue();
}


The specification said that a number is a signed integer in big endian order which is exactly how a Java int is defined, so instead of creating temporary arrays and a BigInteger you could simply have used an int and used bit-shifting:



private int getNumber(byte byteArray, int index) {
int ret = byteArray[index + 4] & 0xff;
ret |= (byteArray[index + 3] & 0xff) < 8;
ret |= (byteArray[index + 2] & 0xff) < 16;
ret |= (byteArray[index + 1] & 0xff) < 24;
return ret;
}


If you had implemented a stream based processing and used a DataInputStream the implementation would have been



private int getNumber(DataInputStream source) {
return source.readInt();
}


Your reading-text-implementation:



private String getString(byte byteArray, int index) {
int start = index + 1; // move to next value after type
StringBuilder value = new StringBuilder();
for(int i = start; i < byteArray.length; i++){
if(byteArray[i] == TERMINATE){
break;
}
value.append((char)byteArray[i]);
}
return value.toString();
}


Not much that can be changed here but for good measure, I'd change (char)byteArray[i] to (char) (byteArray[i] & 0xff). For ASCII-characters that's irrelevant but still ;-)



In getDictionary:



private Object getDictionary(byte byteArray, int index, boolean hasSub) throws JSONException {
[...]
index += ByteBuffer.allocate(4).putInt(dictionarySize).array().length;


That's a very elaborate form of



index += 4;


because that's the number of bytes a signed integer is defined in the given specification.



keyLength += ((String) key).getBytes().length + 1;


getBytes() uses the system's file-encoding. Because this particular example uses ASCII you won't see any effect on 99% of all systems but it would break on a system that runs e.g. with UTF-16 as file-encoding. You can see that youself by starting your Java test app with the system property -Dfile.encoding=UTF16.



This is a common Beginner's Error which might have raised a flag with them as well.



Always, ALWAYS use getBytes(encoding) unless you really want to use the system's file encoding for some reason.



hasSub = hasSub || (byteArray[index] == DICTIONARY || byteArray[index] == LIST);
[...]
} else if (value instanceof AbstractMap.SimpleEntry) {
valueLength = (int) ((AbstractMap.SimpleEntry) value).getKey() - index;
jsonWriter.value(((AbstractMap.SimpleEntry) value).getValue());
}


This code block is duplicated in getList, you should put that into its own method and call it from the two methods. Same is true for the logic with the return-statement. This should be put into its own method so you only need to fix it once in case you find a bug in it.



General stuff:



You had to cope with the fact that your get-methods had to return a value and should change the value of the index. That's not possible, so you decided to change the index value in the calling method in dependence of the type of the returned parsed value. This is "not optimal" to say the least.






share|improve this answer









$endgroup$













  • $begingroup$
    Thank you very much for your review, I see now that this wasn't one of my greatest moments such as missing a obvious requirement. I learned a lot from your review and will bring it into my future assignments. Really appreciate it, cheers!
    $endgroup$
    – Rovdjuret
    Mar 13 at 17:28










  • $begingroup$
    Also, your solution is just a standalone program. Implementing it as a class that can be directly imported into a project and having a separate unit tests probably would have helped. As to code style, it never hurts to mark your fields final and always throw an exception from code that should never be reached.
    $endgroup$
    – TorbenPutkonen
    Mar 14 at 8:14












  • $begingroup$
    Probably should have started a separate answer, but eh, here we go. Using FilterOutputStream and InputStream.transferto(...) would have shown the interviewer your knowledge about the standard APIs.
    $endgroup$
    – TorbenPutkonen
    Mar 14 at 8:23












  • $begingroup$
    @TorbenPutkonen Can you make it a proposed answer, please? :)
    $endgroup$
    – Rovdjuret
    Mar 14 at 9:01











Your Answer





StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f215361%2frecursive-parser-from-binary-to-json-output%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









7












$begingroup$

There are two points that showed up immediately when looking at your implementation:




  • You missed one of the goals of the task: It should read the data from standard input and writes it the result to standard output.

  • You are reading in the complete binary data into memory and process it from there.


The latter is a no go, if we're talking about large structures of JSON-data, e.g. containing Base64-encoded binary data of a DVD-image (don't ask ;-) It might be unlikely in a real world example but it might let them come to the conclusion that you're not familiar with stream based processing and had another one how showed that ability which might have led to their decision.



Some remarks about the actual code:



Your reading-number-implementation:



private int getNumber(byte byteArray, int index) {
int start = index + 1; // move to next value after type
int offset = start + 4;

byte numberByteArray = Arrays.copyOfRange(byteArray, start, offset);

return new BigInteger(numberByteArray).intValue();
}


The specification said that a number is a signed integer in big endian order which is exactly how a Java int is defined, so instead of creating temporary arrays and a BigInteger you could simply have used an int and used bit-shifting:



private int getNumber(byte byteArray, int index) {
int ret = byteArray[index + 4] & 0xff;
ret |= (byteArray[index + 3] & 0xff) < 8;
ret |= (byteArray[index + 2] & 0xff) < 16;
ret |= (byteArray[index + 1] & 0xff) < 24;
return ret;
}


If you had implemented a stream based processing and used a DataInputStream the implementation would have been



private int getNumber(DataInputStream source) {
return source.readInt();
}


Your reading-text-implementation:



private String getString(byte byteArray, int index) {
int start = index + 1; // move to next value after type
StringBuilder value = new StringBuilder();
for(int i = start; i < byteArray.length; i++){
if(byteArray[i] == TERMINATE){
break;
}
value.append((char)byteArray[i]);
}
return value.toString();
}


Not much that can be changed here but for good measure, I'd change (char)byteArray[i] to (char) (byteArray[i] & 0xff). For ASCII-characters that's irrelevant but still ;-)



In getDictionary:



private Object getDictionary(byte byteArray, int index, boolean hasSub) throws JSONException {
[...]
index += ByteBuffer.allocate(4).putInt(dictionarySize).array().length;


That's a very elaborate form of



index += 4;


because that's the number of bytes a signed integer is defined in the given specification.



keyLength += ((String) key).getBytes().length + 1;


getBytes() uses the system's file-encoding. Because this particular example uses ASCII you won't see any effect on 99% of all systems but it would break on a system that runs e.g. with UTF-16 as file-encoding. You can see that youself by starting your Java test app with the system property -Dfile.encoding=UTF16.



This is a common Beginner's Error which might have raised a flag with them as well.



Always, ALWAYS use getBytes(encoding) unless you really want to use the system's file encoding for some reason.



hasSub = hasSub || (byteArray[index] == DICTIONARY || byteArray[index] == LIST);
[...]
} else if (value instanceof AbstractMap.SimpleEntry) {
valueLength = (int) ((AbstractMap.SimpleEntry) value).getKey() - index;
jsonWriter.value(((AbstractMap.SimpleEntry) value).getValue());
}


This code block is duplicated in getList, you should put that into its own method and call it from the two methods. Same is true for the logic with the return-statement. This should be put into its own method so you only need to fix it once in case you find a bug in it.



General stuff:



You had to cope with the fact that your get-methods had to return a value and should change the value of the index. That's not possible, so you decided to change the index value in the calling method in dependence of the type of the returned parsed value. This is "not optimal" to say the least.






share|improve this answer









$endgroup$













  • $begingroup$
    Thank you very much for your review, I see now that this wasn't one of my greatest moments such as missing a obvious requirement. I learned a lot from your review and will bring it into my future assignments. Really appreciate it, cheers!
    $endgroup$
    – Rovdjuret
    Mar 13 at 17:28










  • $begingroup$
    Also, your solution is just a standalone program. Implementing it as a class that can be directly imported into a project and having a separate unit tests probably would have helped. As to code style, it never hurts to mark your fields final and always throw an exception from code that should never be reached.
    $endgroup$
    – TorbenPutkonen
    Mar 14 at 8:14












  • $begingroup$
    Probably should have started a separate answer, but eh, here we go. Using FilterOutputStream and InputStream.transferto(...) would have shown the interviewer your knowledge about the standard APIs.
    $endgroup$
    – TorbenPutkonen
    Mar 14 at 8:23












  • $begingroup$
    @TorbenPutkonen Can you make it a proposed answer, please? :)
    $endgroup$
    – Rovdjuret
    Mar 14 at 9:01
















7












$begingroup$

There are two points that showed up immediately when looking at your implementation:




  • You missed one of the goals of the task: It should read the data from standard input and writes it the result to standard output.

  • You are reading in the complete binary data into memory and process it from there.


The latter is a no go, if we're talking about large structures of JSON-data, e.g. containing Base64-encoded binary data of a DVD-image (don't ask ;-) It might be unlikely in a real world example but it might let them come to the conclusion that you're not familiar with stream based processing and had another one how showed that ability which might have led to their decision.



Some remarks about the actual code:



Your reading-number-implementation:



private int getNumber(byte byteArray, int index) {
int start = index + 1; // move to next value after type
int offset = start + 4;

byte numberByteArray = Arrays.copyOfRange(byteArray, start, offset);

return new BigInteger(numberByteArray).intValue();
}


The specification said that a number is a signed integer in big endian order which is exactly how a Java int is defined, so instead of creating temporary arrays and a BigInteger you could simply have used an int and used bit-shifting:



private int getNumber(byte byteArray, int index) {
int ret = byteArray[index + 4] & 0xff;
ret |= (byteArray[index + 3] & 0xff) < 8;
ret |= (byteArray[index + 2] & 0xff) < 16;
ret |= (byteArray[index + 1] & 0xff) < 24;
return ret;
}


If you had implemented a stream based processing and used a DataInputStream the implementation would have been



private int getNumber(DataInputStream source) {
return source.readInt();
}


Your reading-text-implementation:



private String getString(byte byteArray, int index) {
int start = index + 1; // move to next value after type
StringBuilder value = new StringBuilder();
for(int i = start; i < byteArray.length; i++){
if(byteArray[i] == TERMINATE){
break;
}
value.append((char)byteArray[i]);
}
return value.toString();
}


Not much that can be changed here but for good measure, I'd change (char)byteArray[i] to (char) (byteArray[i] & 0xff). For ASCII-characters that's irrelevant but still ;-)



In getDictionary:



private Object getDictionary(byte byteArray, int index, boolean hasSub) throws JSONException {
[...]
index += ByteBuffer.allocate(4).putInt(dictionarySize).array().length;


That's a very elaborate form of



index += 4;


because that's the number of bytes a signed integer is defined in the given specification.



keyLength += ((String) key).getBytes().length + 1;


getBytes() uses the system's file-encoding. Because this particular example uses ASCII you won't see any effect on 99% of all systems but it would break on a system that runs e.g. with UTF-16 as file-encoding. You can see that youself by starting your Java test app with the system property -Dfile.encoding=UTF16.



This is a common Beginner's Error which might have raised a flag with them as well.



Always, ALWAYS use getBytes(encoding) unless you really want to use the system's file encoding for some reason.



hasSub = hasSub || (byteArray[index] == DICTIONARY || byteArray[index] == LIST);
[...]
} else if (value instanceof AbstractMap.SimpleEntry) {
valueLength = (int) ((AbstractMap.SimpleEntry) value).getKey() - index;
jsonWriter.value(((AbstractMap.SimpleEntry) value).getValue());
}


This code block is duplicated in getList, you should put that into its own method and call it from the two methods. Same is true for the logic with the return-statement. This should be put into its own method so you only need to fix it once in case you find a bug in it.



General stuff:



You had to cope with the fact that your get-methods had to return a value and should change the value of the index. That's not possible, so you decided to change the index value in the calling method in dependence of the type of the returned parsed value. This is "not optimal" to say the least.






share|improve this answer









$endgroup$













  • $begingroup$
    Thank you very much for your review, I see now that this wasn't one of my greatest moments such as missing a obvious requirement. I learned a lot from your review and will bring it into my future assignments. Really appreciate it, cheers!
    $endgroup$
    – Rovdjuret
    Mar 13 at 17:28










  • $begingroup$
    Also, your solution is just a standalone program. Implementing it as a class that can be directly imported into a project and having a separate unit tests probably would have helped. As to code style, it never hurts to mark your fields final and always throw an exception from code that should never be reached.
    $endgroup$
    – TorbenPutkonen
    Mar 14 at 8:14












  • $begingroup$
    Probably should have started a separate answer, but eh, here we go. Using FilterOutputStream and InputStream.transferto(...) would have shown the interviewer your knowledge about the standard APIs.
    $endgroup$
    – TorbenPutkonen
    Mar 14 at 8:23












  • $begingroup$
    @TorbenPutkonen Can you make it a proposed answer, please? :)
    $endgroup$
    – Rovdjuret
    Mar 14 at 9:01














7












7








7





$begingroup$

There are two points that showed up immediately when looking at your implementation:




  • You missed one of the goals of the task: It should read the data from standard input and writes it the result to standard output.

  • You are reading in the complete binary data into memory and process it from there.


The latter is a no go, if we're talking about large structures of JSON-data, e.g. containing Base64-encoded binary data of a DVD-image (don't ask ;-) It might be unlikely in a real world example but it might let them come to the conclusion that you're not familiar with stream based processing and had another one how showed that ability which might have led to their decision.



Some remarks about the actual code:



Your reading-number-implementation:



private int getNumber(byte byteArray, int index) {
int start = index + 1; // move to next value after type
int offset = start + 4;

byte numberByteArray = Arrays.copyOfRange(byteArray, start, offset);

return new BigInteger(numberByteArray).intValue();
}


The specification said that a number is a signed integer in big endian order which is exactly how a Java int is defined, so instead of creating temporary arrays and a BigInteger you could simply have used an int and used bit-shifting:



private int getNumber(byte byteArray, int index) {
int ret = byteArray[index + 4] & 0xff;
ret |= (byteArray[index + 3] & 0xff) < 8;
ret |= (byteArray[index + 2] & 0xff) < 16;
ret |= (byteArray[index + 1] & 0xff) < 24;
return ret;
}


If you had implemented a stream based processing and used a DataInputStream the implementation would have been



private int getNumber(DataInputStream source) {
return source.readInt();
}


Your reading-text-implementation:



private String getString(byte byteArray, int index) {
int start = index + 1; // move to next value after type
StringBuilder value = new StringBuilder();
for(int i = start; i < byteArray.length; i++){
if(byteArray[i] == TERMINATE){
break;
}
value.append((char)byteArray[i]);
}
return value.toString();
}


Not much that can be changed here but for good measure, I'd change (char)byteArray[i] to (char) (byteArray[i] & 0xff). For ASCII-characters that's irrelevant but still ;-)



In getDictionary:



private Object getDictionary(byte byteArray, int index, boolean hasSub) throws JSONException {
[...]
index += ByteBuffer.allocate(4).putInt(dictionarySize).array().length;


That's a very elaborate form of



index += 4;


because that's the number of bytes a signed integer is defined in the given specification.



keyLength += ((String) key).getBytes().length + 1;


getBytes() uses the system's file-encoding. Because this particular example uses ASCII you won't see any effect on 99% of all systems but it would break on a system that runs e.g. with UTF-16 as file-encoding. You can see that youself by starting your Java test app with the system property -Dfile.encoding=UTF16.



This is a common Beginner's Error which might have raised a flag with them as well.



Always, ALWAYS use getBytes(encoding) unless you really want to use the system's file encoding for some reason.



hasSub = hasSub || (byteArray[index] == DICTIONARY || byteArray[index] == LIST);
[...]
} else if (value instanceof AbstractMap.SimpleEntry) {
valueLength = (int) ((AbstractMap.SimpleEntry) value).getKey() - index;
jsonWriter.value(((AbstractMap.SimpleEntry) value).getValue());
}


This code block is duplicated in getList, you should put that into its own method and call it from the two methods. Same is true for the logic with the return-statement. This should be put into its own method so you only need to fix it once in case you find a bug in it.



General stuff:



You had to cope with the fact that your get-methods had to return a value and should change the value of the index. That's not possible, so you decided to change the index value in the calling method in dependence of the type of the returned parsed value. This is "not optimal" to say the least.






share|improve this answer









$endgroup$



There are two points that showed up immediately when looking at your implementation:




  • You missed one of the goals of the task: It should read the data from standard input and writes it the result to standard output.

  • You are reading in the complete binary data into memory and process it from there.


The latter is a no go, if we're talking about large structures of JSON-data, e.g. containing Base64-encoded binary data of a DVD-image (don't ask ;-) It might be unlikely in a real world example but it might let them come to the conclusion that you're not familiar with stream based processing and had another one how showed that ability which might have led to their decision.



Some remarks about the actual code:



Your reading-number-implementation:



private int getNumber(byte byteArray, int index) {
int start = index + 1; // move to next value after type
int offset = start + 4;

byte numberByteArray = Arrays.copyOfRange(byteArray, start, offset);

return new BigInteger(numberByteArray).intValue();
}


The specification said that a number is a signed integer in big endian order which is exactly how a Java int is defined, so instead of creating temporary arrays and a BigInteger you could simply have used an int and used bit-shifting:



private int getNumber(byte byteArray, int index) {
int ret = byteArray[index + 4] & 0xff;
ret |= (byteArray[index + 3] & 0xff) < 8;
ret |= (byteArray[index + 2] & 0xff) < 16;
ret |= (byteArray[index + 1] & 0xff) < 24;
return ret;
}


If you had implemented a stream based processing and used a DataInputStream the implementation would have been



private int getNumber(DataInputStream source) {
return source.readInt();
}


Your reading-text-implementation:



private String getString(byte byteArray, int index) {
int start = index + 1; // move to next value after type
StringBuilder value = new StringBuilder();
for(int i = start; i < byteArray.length; i++){
if(byteArray[i] == TERMINATE){
break;
}
value.append((char)byteArray[i]);
}
return value.toString();
}


Not much that can be changed here but for good measure, I'd change (char)byteArray[i] to (char) (byteArray[i] & 0xff). For ASCII-characters that's irrelevant but still ;-)



In getDictionary:



private Object getDictionary(byte byteArray, int index, boolean hasSub) throws JSONException {
[...]
index += ByteBuffer.allocate(4).putInt(dictionarySize).array().length;


That's a very elaborate form of



index += 4;


because that's the number of bytes a signed integer is defined in the given specification.



keyLength += ((String) key).getBytes().length + 1;


getBytes() uses the system's file-encoding. Because this particular example uses ASCII you won't see any effect on 99% of all systems but it would break on a system that runs e.g. with UTF-16 as file-encoding. You can see that youself by starting your Java test app with the system property -Dfile.encoding=UTF16.



This is a common Beginner's Error which might have raised a flag with them as well.



Always, ALWAYS use getBytes(encoding) unless you really want to use the system's file encoding for some reason.



hasSub = hasSub || (byteArray[index] == DICTIONARY || byteArray[index] == LIST);
[...]
} else if (value instanceof AbstractMap.SimpleEntry) {
valueLength = (int) ((AbstractMap.SimpleEntry) value).getKey() - index;
jsonWriter.value(((AbstractMap.SimpleEntry) value).getValue());
}


This code block is duplicated in getList, you should put that into its own method and call it from the two methods. Same is true for the logic with the return-statement. This should be put into its own method so you only need to fix it once in case you find a bug in it.



General stuff:



You had to cope with the fact that your get-methods had to return a value and should change the value of the index. That's not possible, so you decided to change the index value in the calling method in dependence of the type of the returned parsed value. This is "not optimal" to say the least.







share|improve this answer












share|improve this answer



share|improve this answer










answered Mar 13 at 17:16









LotharLothar

1862




1862












  • $begingroup$
    Thank you very much for your review, I see now that this wasn't one of my greatest moments such as missing a obvious requirement. I learned a lot from your review and will bring it into my future assignments. Really appreciate it, cheers!
    $endgroup$
    – Rovdjuret
    Mar 13 at 17:28










  • $begingroup$
    Also, your solution is just a standalone program. Implementing it as a class that can be directly imported into a project and having a separate unit tests probably would have helped. As to code style, it never hurts to mark your fields final and always throw an exception from code that should never be reached.
    $endgroup$
    – TorbenPutkonen
    Mar 14 at 8:14












  • $begingroup$
    Probably should have started a separate answer, but eh, here we go. Using FilterOutputStream and InputStream.transferto(...) would have shown the interviewer your knowledge about the standard APIs.
    $endgroup$
    – TorbenPutkonen
    Mar 14 at 8:23












  • $begingroup$
    @TorbenPutkonen Can you make it a proposed answer, please? :)
    $endgroup$
    – Rovdjuret
    Mar 14 at 9:01


















  • $begingroup$
    Thank you very much for your review, I see now that this wasn't one of my greatest moments such as missing a obvious requirement. I learned a lot from your review and will bring it into my future assignments. Really appreciate it, cheers!
    $endgroup$
    – Rovdjuret
    Mar 13 at 17:28










  • $begingroup$
    Also, your solution is just a standalone program. Implementing it as a class that can be directly imported into a project and having a separate unit tests probably would have helped. As to code style, it never hurts to mark your fields final and always throw an exception from code that should never be reached.
    $endgroup$
    – TorbenPutkonen
    Mar 14 at 8:14












  • $begingroup$
    Probably should have started a separate answer, but eh, here we go. Using FilterOutputStream and InputStream.transferto(...) would have shown the interviewer your knowledge about the standard APIs.
    $endgroup$
    – TorbenPutkonen
    Mar 14 at 8:23












  • $begingroup$
    @TorbenPutkonen Can you make it a proposed answer, please? :)
    $endgroup$
    – Rovdjuret
    Mar 14 at 9:01
















$begingroup$
Thank you very much for your review, I see now that this wasn't one of my greatest moments such as missing a obvious requirement. I learned a lot from your review and will bring it into my future assignments. Really appreciate it, cheers!
$endgroup$
– Rovdjuret
Mar 13 at 17:28




$begingroup$
Thank you very much for your review, I see now that this wasn't one of my greatest moments such as missing a obvious requirement. I learned a lot from your review and will bring it into my future assignments. Really appreciate it, cheers!
$endgroup$
– Rovdjuret
Mar 13 at 17:28












$begingroup$
Also, your solution is just a standalone program. Implementing it as a class that can be directly imported into a project and having a separate unit tests probably would have helped. As to code style, it never hurts to mark your fields final and always throw an exception from code that should never be reached.
$endgroup$
– TorbenPutkonen
Mar 14 at 8:14






$begingroup$
Also, your solution is just a standalone program. Implementing it as a class that can be directly imported into a project and having a separate unit tests probably would have helped. As to code style, it never hurts to mark your fields final and always throw an exception from code that should never be reached.
$endgroup$
– TorbenPutkonen
Mar 14 at 8:14














$begingroup$
Probably should have started a separate answer, but eh, here we go. Using FilterOutputStream and InputStream.transferto(...) would have shown the interviewer your knowledge about the standard APIs.
$endgroup$
– TorbenPutkonen
Mar 14 at 8:23






$begingroup$
Probably should have started a separate answer, but eh, here we go. Using FilterOutputStream and InputStream.transferto(...) would have shown the interviewer your knowledge about the standard APIs.
$endgroup$
– TorbenPutkonen
Mar 14 at 8:23














$begingroup$
@TorbenPutkonen Can you make it a proposed answer, please? :)
$endgroup$
– Rovdjuret
Mar 14 at 9:01




$begingroup$
@TorbenPutkonen Can you make it a proposed answer, please? :)
$endgroup$
– Rovdjuret
Mar 14 at 9:01


















draft saved

draft discarded




















































Thanks for contributing an answer to Code Review Stack Exchange!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


Use MathJax to format equations. MathJax reference.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f215361%2frecursive-parser-from-binary-to-json-output%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

How to change which sound is reproduced for terminal bell?

Can I use Tabulator js library in my java Spring + Thymeleaf project?

Title Spacing in Bjornstrup Chapter, Removing Chapter Number From Contents