JSON Parser
Introduction
In this case study, we’ll design a JSON parser from scratch using a systematic 8-step approach. The problem is a good exercise in low-level design (LLD) because it demonstrates fundamental computer science concepts such as lexical analysis, parsing, and error handling.
Problem Statement
Design a JSON Parser that can:
- Convert JSON strings into corresponding data structures (objects, arrays, primitives)
- Support all JSON data types: strings, numbers (integers and floats), booleans, null, objects, arrays
- Handle nested objects and arrays of arbitrary depth
- Validate JSON syntax during parsing and throw exceptions for invalid JSON
- Provide meaningful error messages with position information (line and column)
- Correctly handle Unicode characters and escape sequences (\n, \t, \r, \", \\, \uXXXX)
- Handle whitespace (spaces, tabs, newlines) correctly between JSON elements
- Support empty objects {} and empty arrays []
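As a reference point for the target behavior, the sketch below uses Python's standard `json` module, not the parser designed in this case study. It only illustrates the JSON-to-native mapping the requirements describe and shows that a syntax error can carry a line and column. The helper name `error_position` is made up for this illustration.

```python
import json

# Reference behavior only: standard library, not the parser built in this case study.
doc = '{"name": "Ada", "tags": ["x", "y"], "score": 9.5, "active": true, "extra": null, "empty": {}}'
value = json.loads(doc)

# Each JSON type maps to a native Python type.
native_types = {key: type(item).__name__ for key, item in value.items()}


def error_position(text):
    """Return (line, column) of the first syntax error, or None if text is valid JSON."""
    try:
        json.loads(text)
        return None
    except json.JSONDecodeError as err:
        # lineno and colno are 1-indexed attributes of JSONDecodeError
        return err.lineno, err.colno


print(native_types)
print(error_position('[1, 2,]'))
```

The parser designed in the following steps aims for the same observable behavior, with its own value classes and exception type.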
Step 1: Clarify Requirements
Before designing, let’s ask the right questions!
Clarifying Questions
Functional Requirements:
- JSON Data Types: What types should be supported? → Strings, numbers (integers and floats), booleans (true/false), null, objects, arrays
- Nested Structures: How deep can nesting go? → Arbitrary depth, but excessively deep input should be handled gracefully rather than crashing
- Error Handling: How should invalid JSON be handled? → Throw exceptions with clear error messages and position information
- Unicode Support: Should Unicode characters be supported? → Yes, including escape sequences (\n, \t, \uXXXX, etc.)
- Whitespace: How should whitespace be handled? → Correctly handle spaces, tabs, newlines between elements
- Comments: Should comments be supported? → No, JSON doesn’t support comments
Non-Functional Requirements:
- Separation of Concerns: Should tokenization and parsing be separate? → Yes, clear separation
- Parsing Approach: What parsing technique? → Recursive descent parsing
- Error Messages: What level of detail? → Line and column numbers for debugging
- Extensibility: Should it be easy to extend? → Yes, allow easy addition of new features
Edge Cases to Consider:
- What if JSON string is empty?
- What if string is unterminated?
- What if number format is invalid?
- What if escape sequence is invalid?
- What if structure is malformed (missing commas, colons)?
- What if nesting is too deep?
Requirements Summary
Step 2: Identify Actors
Actors are external entities that interact with the system.
Who Uses This System?
Primary Actors:
- Client Application - Uses the JSON parser to parse JSON strings into data structures
- Developer - Uses the parser in their application code
Note: The parser is a library component used by other applications. It doesn’t have external actors in the traditional sense, but it serves client applications that need to parse JSON.
Actor Interactions
Step 3: Identify Entities
Entities are core objects in your system that have data and behavior.
The Noun Hunt
Looking at the requirements, we identify these core entities:
- JsonParser - Main orchestrator, performs syntactic analysis
- JsonTokenizer - Performs lexical analysis (tokenization)
- Token - Represents a single token with type, value, and position
- TokenType - Enum for token types (LEFT_BRACE, STRING, NUMBER, etc.)
- JsonValue - Abstract base class for all JSON value types
- JsonObject - Represents JSON object (key-value pairs)
- JsonArray - Represents JSON array (ordered list)
- JsonString - Represents JSON string value
- JsonNumber - Represents JSON number value
- JsonBoolean - Represents JSON boolean value
- JsonNull - Represents JSON null value
- JsonParseException - Custom exception with position information
Entity Relationships
Each class should have a single, well-defined responsibility (SRP).
Responsibility Assignment
JsonParser:
- Orchestrate parsing process using recursive descent
- Maintain current token
- Coordinate between tokenizer and value construction
- Parse objects, arrays, and primitive values
- Validate JSON structure
JsonTokenizer:
- Break JSON string into tokens (lexical analysis)
- Skip whitespace while tracking position
- Parse strings with escape sequences
- Parse numbers (integers, floats, scientific notation)
- Parse keywords (true, false, null)
- Track line and column numbers for error reporting
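To make the lexical-analysis responsibility concrete, here is a minimal regex-based sketch. It is not the hand-written tokenizer of this design: the names `TOKEN_SPEC`, `MASTER`, and `tokenize` are hypothetical, escape sequences are matched but not decoded, and keywords are lumped into one `KEYWORD` kind. It only shows how an input string becomes a stream of tokens with 1-indexed line and column positions.

```python
import re

# Hypothetical token table: (kind, regex). Earlier entries win in the alternation.
TOKEN_SPEC = [
    ("STRING", r'"(?:[^"\\\n]|\\.)*"'),
    ("NUMBER", r'-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?'),
    ("KEYWORD", r'true|false|null'),
    ("PUNCT", r'[{}\[\]:,]'),
    ("NEWLINE", r'\n'),
    ("SKIP", r'[ \t\r]+'),
    ("MISMATCH", r'.'),
]
MASTER = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))


def tokenize(text):
    """Yield (kind, lexeme, line, column) tuples; line and column are 1-indexed."""
    line, line_start = 1, 0
    for match in MASTER.finditer(text):
        kind, lexeme = match.lastgroup, match.group()
        column = match.start() - line_start + 1
        if kind == "NEWLINE":
            # A newline starts a new line; columns restart after it
            line, line_start = line + 1, match.end()
        elif kind == "SKIP":
            continue
        elif kind == "MISMATCH":
            raise ValueError(f"Unexpected character {lexeme!r} at {line}:{column}")
        else:
            yield kind, lexeme, line, column


tokens = list(tokenize('{"a": 1,\n "b": [true]}'))
for token in tokens:
    print(token)
```

A character-by-character tokenizer, as used later in this case study, gives finer control over error messages (for example, distinguishing an unterminated string from an invalid escape), which a single regex cannot easily report.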
Token:
- Store token type, value, and position information
- Provide access to token properties
JsonValue (Abstract):
- Define common interface for all JSON types
- Provide getValue() and toString() methods
JsonObject:
- Store key-value pairs as map
- Provide methods to add/get properties
- Convert to native map structure
JsonArray:
- Store elements as ordered list
- Provide methods to add/get elements
- Convert to native list structure
JsonString, JsonNumber, JsonBoolean, JsonNull:
- Represent primitive JSON values
- Store and return appropriate value types
JsonParseException:
- Extend standard exception
- Include line and column information
- Provide formatted error messages
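One possible way to make formatted error messages more useful is to echo the offending source line with a caret under the reported column. The sketch below is an optional extension, not part of the design above; the function name `format_error` is hypothetical, and it assumes 1-indexed line and column values as used throughout this case study.

```python
def format_error(source, line, column, message):
    """Return a message showing the offending line with a caret under the column.

    line and column are 1-indexed, matching the positions tracked by the tokenizer.
    """
    lines = source.splitlines() or [""]
    # Clamp the line so errors reported at end of input still point at real text
    text = lines[min(line, len(lines)) - 1]
    pointer = " " * (column - 1) + "^"
    return f"JSON Parse Error at line {line}, column {column}: {message}\n{text}\n{pointer}"


report = format_error('{"num": 12.}', 1, 12, "expected digit after decimal point")
print(report)
```

Because the exception already stores line and column, a helper like this can live outside the exception class and be applied only where the original source text is still available.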
Responsibility Visualization
Step 5: Design Class Diagrams
Class diagrams show structure, relationships, and design patterns.
classDiagram
class TokenType {
<<enumeration>>
LEFT_BRACE
RIGHT_BRACE
LEFT_BRACKET
RIGHT_BRACKET
COLON
COMMA
STRING
NUMBER
BOOLEAN_TRUE
BOOLEAN_FALSE
NULL
EOF
}
class JsonParseException {
-String message
-int line
-int column
+getLine() int
+getColumn() int
+getErrorMessage() String
}
class Token {
-TokenType type
-Object value
-int line
-int column
+getType() TokenType
+getValue() Object
+getLine() int
+getColumn() int
}
class JsonTokenizer {
-String input
-int position
-int line
-int column
+nextToken() Token
-skipWhitespace() void
-parseString() Token
-parseNumber() Token
-parseKeyword(String, TokenType, int, int) Token
-advance() void
+getCurrentLine() int
+getCurrentColumn() int
}
class JsonValue {
<<abstract>>
+getValue() Object
+toString() String
}
class JsonString {
-String value
+getValue() String
+toString() String
-escapeString(String) String
}
class JsonNumber {
-Number value
+getValue() Number
+toString() String
}
class JsonBoolean {
-boolean value
+getValue() boolean
+toString() String
}
class JsonNull {
+getValue() null
+toString() String
}
class JsonObject {
-Map~String,JsonValue~ properties
+put(String, JsonValue) void
+get(String) JsonValue
+getProperties() Map
+getValue() Map~String,Object~
+toString() String
}
class JsonArray {
-List~JsonValue~ elements
+add(JsonValue) void
+get(int) JsonValue
+size() int
+getElements() List
+getValue() List~Object~
+toString() String
}
class JsonParser {
-JsonTokenizer tokenizer
-Token currentToken
+JsonParser(String)
+parse() JsonValue
-parseValue() JsonValue
-parseObject() JsonObject
-parseArray() JsonArray
-consumeToken() void
-consumeToken(TokenType) void
}
JsonValue <|-- JsonString
JsonValue <|-- JsonNumber
JsonValue <|-- JsonBoolean
JsonValue <|-- JsonNull
JsonValue <|-- JsonObject
JsonValue <|-- JsonArray
JsonParser --> JsonTokenizer : uses
JsonParser --> Token : uses
JsonParser --> JsonValue : creates
JsonTokenizer --> Token : creates
JsonTokenizer --> JsonParseException : throws
JsonParser --> JsonParseException : throws
JsonObject --> JsonValue : contains
JsonArray --> JsonValue : contains
Token --> TokenType : uses
1. Separation of Concerns - Tokenizer vs Parser
- Lexical analysis (tokenization) separated from syntactic analysis (parsing)
- Easier to test and maintain
2. Recursive Descent Parsing - Natural fit for JSON
- Parsing methods call themselves recursively
- Handles nested structures naturally
3. Polymorphism - JsonValue hierarchy
- All JSON types extend JsonValue
- Common interface for all value types
4. Template Method Pattern - JsonValue interface
- Defines common structure for all JSON types
- Each subclass implements specific behavior
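The recursive descent idea from point 2 can be sketched in a few functions. This is a simplified illustration, not the implementation given later: it works on a pre-built list of `(kind, value)` tuples ending in an `("EOF", None)` sentinel, returns native Python values instead of a JsonValue hierarchy, and the function names are hypothetical.

```python
def parse_value(tokens, i=0):
    """Return (value, next_index) for the JSON value that starts at tokens[i]."""
    kind, value = tokens[i]
    if kind == "[":
        return parse_array(tokens, i + 1)
    if kind == "{":
        return parse_object(tokens, i + 1)
    if kind in ("STRING", "NUMBER", "LITERAL"):
        return value, i + 1
    raise ValueError(f"Unexpected token {kind!r} at index {i}")


def parse_array(tokens, i):
    """Parse array contents; tokens[i] is the first token after '['."""
    items = []
    if tokens[i][0] == "]":
        return items, i + 1
    while True:
        item, i = parse_value(tokens, i)  # recursion handles nested values
        items.append(item)
        kind = tokens[i][0]
        i += 1
        if kind == "]":
            return items, i
        if kind != ",":
            raise ValueError(f"Expected ',' or ']' at index {i - 1}")


def parse_object(tokens, i):
    """Parse object contents; tokens[i] is the first token after '{'."""
    obj = {}
    if tokens[i][0] == "}":
        return obj, i + 1
    while True:
        kind, key = tokens[i]
        if kind != "STRING":
            raise ValueError(f"Expected string key at index {i}")
        if tokens[i + 1][0] != ":":
            raise ValueError(f"Expected ':' at index {i + 1}")
        value, i = parse_value(tokens, i + 2)  # recursion handles nested values
        obj[key] = value
        kind = tokens[i][0]
        i += 1
        if kind == "}":
            return obj, i
        if kind != ",":
            raise ValueError(f"Expected ',' or '}}' at index {i - 1}")


# Tokens for: {"a": [1, {"b": null}]}
tokens = [
    ("{", None), ("STRING", "a"), (":", None), ("[", None), ("NUMBER", 1),
    (",", None), ("{", None), ("STRING", "b"), (":", None), ("LITERAL", None),
    ("}", None), ("]", None), ("}", None), ("EOF", None),
]
result, end = parse_value(tokens)
print(result, end)
```

Each grammar rule (value, array, object) gets one function, and nesting falls out of the mutual recursion between them. A trailing comma such as `[1,]` is rejected because `parse_value` is called on `]`, which is not a valid start of a value.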
Step 6: Define Contracts & APIs
Contracts define how classes interact: method signatures, interfaces, and behavior.
JsonValue Abstract Class (Python):

    from abc import ABC, abstractmethod

    class JsonValue(ABC):
        """
        Abstract base class for all JSON value types.
        Provides common interface for all JSON types.
        """
        @abstractmethod
        def get_value(self):
            """
            Get the native value representation.

            Returns:
                Object: Native value (dict, list, str, int, float, bool, None)
            """
            pass

        @abstractmethod
        def __str__(self):
            """
            Convert to JSON string representation.

            Returns:
                str: JSON string representation
            """
            pass

JsonTokenizer (Python):

    class JsonTokenizer:
        """
        Performs lexical analysis - breaks JSON string into tokens.
        """
        def next_token(self) -> Token:
            """
            Get the next token from input.

            Returns:
                Token: Next token in the stream

            Raises:
                JsonParseException: If invalid character or format encountered
            """
            pass

        def get_current_line(self) -> int:
            """Get current line number (1-indexed)."""
            pass

        def get_current_column(self) -> int:
            """Get current column number (1-indexed)."""
            pass

JsonValue Abstract Class (Java):

    abstract class JsonValue {
        /**
         * Get the native value representation.
         * @return Native value (Map, List, String, Number, Boolean, null)
         */
        public abstract Object getValue();

        /**
         * Convert to JSON string representation.
         * @return JSON string representation
         */
        public abstract String toString();
    }

JsonTokenizer (Java, signatures only):

    class JsonTokenizer {
        /**
         * Get the next token from input.
         * @return Next token in the stream
         * @throws JsonParseException If invalid character or format encountered
         */
        public Token nextToken() throws JsonParseException;

        /** Get current line number (1-indexed). */
        public int getCurrentLine();

        /** Get current column number (1-indexed). */
        public int getCurrentColumn();
    }

Core Class Contracts
JsonParser (Python):

    class JsonParser:
        """
        Main parser that orchestrates JSON parsing using recursive descent.
        """
        def __init__(self, json_string: str):
            """
            Initialize parser with JSON string.

            Args:
                json_string: JSON string to parse
            """
            pass

        def parse(self) -> JsonValue:
            """
            Parse JSON string into JsonValue.

            Returns:
                JsonValue: Parsed JSON value

            Raises:
                JsonParseException: If JSON is invalid or malformed
            """
            pass

        def _parse_value(self) -> JsonValue:
            """
            Parse a JSON value (object, array, or primitive).
            Internal method used recursively.
            """
            pass

        def _parse_object(self) -> JsonObject:
            """
            Parse a JSON object (key-value pairs).
            Internal method used recursively.
            """
            pass

        def _parse_array(self) -> JsonArray:
            """
            Parse a JSON array (ordered list).
            Internal method used recursively.
            """
            pass

JsonParser (Java, signatures only):

    class JsonParser {
        /**
         * Initialize parser with JSON string.
         * @param jsonString JSON string to parse
         */
        public JsonParser(String jsonString);

        /**
         * Parse JSON string into JsonValue.
         * @return Parsed JSON value
         * @throws JsonParseException If JSON is invalid or malformed
         */
        public JsonValue parse() throws JsonParseException;

        /**
         * Parse a JSON value (object, array, or primitive).
         * Internal method used recursively.
         */
        private JsonValue parseValue() throws JsonParseException;

        /**
         * Parse a JSON object (key-value pairs).
         * Internal method used recursively.
         */
        private JsonObject parseObject() throws JsonParseException;

        /**
         * Parse a JSON array (ordered list).
         * Internal method used recursively.
         */
        private JsonArray parseArray() throws JsonParseException;
    }

Contract Visualization
Step 7: Handle Edge Cases
Edge cases are scenarios that might not be obvious but are important to handle.
Critical Edge Cases
1. Empty JSON String
- Return appropriate error or handle gracefully
- Validate input is not empty before parsing
2. Unterminated Strings
- String not closed with closing quote
- Track position and throw exception with line/column
3. Invalid Number Format
- Missing digits after decimal point: 12.
- Missing digits in exponent: 1.5e
- Invalid characters in number
- Throw exception with position information
4. Invalid Escape Sequences
- Invalid escape: \x (not \n, \t, etc.)
- Incomplete Unicode: \u12 (needs 4 hex digits)
- Invalid hex digits: \uXYZW
- Throw exception with position information
5. Malformed Structure
- Missing commas: {"key1": "value1" "key2": "value2"}
- Missing colons: {"key" "value"}
- Unexpected tokens: {"key": }
- Extra commas: [1, 2, 3,]
- Throw exception with position information
6. Deep Nesting
- Extremely nested structures might cause stack overflow
- Consider depth limit or iterative approach for very deep nesting
7. Invalid Keywords
- Partial keywords: tru, fals, nul
- Validate complete keyword matches
8. Whitespace Handling
- Multiple spaces, tabs, newlines between elements
- Track line/column correctly across newlines
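For the deep-nesting case (item 6), one possible guard is an iterative pre-scan that counts bracket depth before the recursive parser runs. This is a sketch under assumptions: `MAX_DEPTH`, `DepthLimitExceeded`, and `check_nesting` are hypothetical names, and the limit of 64 is an arbitrary illustrative value. An alternative is to keep a depth counter inside the recursive parse methods themselves.

```python
MAX_DEPTH = 64  # hypothetical limit chosen for illustration


class DepthLimitExceeded(ValueError):
    """Raised when the input nests objects/arrays deeper than the allowed limit."""


def check_nesting(text, max_depth=MAX_DEPTH):
    """Return the deepest nesting level in text, ignoring brackets inside strings.

    Raises DepthLimitExceeded as soon as the depth goes past max_depth.
    The scan is a plain loop, so it cannot overflow the call stack.
    """
    depth = deepest = 0
    in_string = escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False          # character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "[{":
            depth += 1
            deepest = max(deepest, depth)
            if depth > max_depth:
                raise DepthLimitExceeded(f"Nesting deeper than {max_depth} levels")
        elif ch in "]}":
            depth -= 1
    return deepest


print(check_nesting('{"a": [1, {"b": "[[["}]}'))
```

The pre-scan does not validate the JSON; it only bounds recursion depth so that the parser can report a clear error instead of failing with a stack overflow.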
Edge Case Handling Flow
Error Handling Examples
Python:

    class JsonParseException(Exception):
        def __init__(self, message: str, line: int, column: int):
            self.message = message
            self.line = line
            self.column = column
            super().__init__(f"JSON Parse Error at line {line}, column {column}: {message}")

    # Example error scenarios
    try:
        parser = JsonParser('{"key": "unclosed')
        parser.parse()
    except JsonParseException as e:
        print(f"Error: {e.message} at {e.line}:{e.column}")
        # Output: Error: Unterminated string at 1:18

    try:
        parser = JsonParser('{"num": 12.}')
        parser.parse()
    except JsonParseException as e:
        print(f"Error: {e.message} at {e.line}:{e.column}")
        # Output: Error: Invalid number format: expected digit after decimal point at 1:12

Java:

    class JsonParseException extends Exception {
        private int line;
        private int column;
        private String message;

        public JsonParseException(String message, int line, int column) {
            super(String.format("JSON Parse Error at line %d, column %d: %s",
                    line, column, message));
            this.message = message;
            this.line = line;
            this.column = column;
        }
    }

    // Example error scenarios
    try {
        JsonParser parser = new JsonParser("{\"key\": \"unclosed");
        parser.parse();
    } catch (JsonParseException e) {
        System.out.println("Error: " + e.getMessage());
        // Output: Error: JSON Parse Error at line 1, column 18: Unterminated string
    }

Step 8: Code Implementation
Finally, implement your design following SOLID principles and design patterns.
Complete Implementation
Python:

    # File: exceptions/json_parse_exception.py
    class JsonParseException(Exception):
        def __init__(self, message: str, line: int, column: int):
            self.message = message
            self.line = line
            self.column = column
            super().__init__(f"JSON Parse Error at line {line}, column {column}: {message}")

    # File: models/token_type.py
    from enum import Enum

    class TokenType(Enum):
        LEFT_BRACE = "LEFT_BRACE"
        RIGHT_BRACE = "RIGHT_BRACE"
        LEFT_BRACKET = "LEFT_BRACKET"
        RIGHT_BRACKET = "RIGHT_BRACKET"
        COLON = "COLON"
        COMMA = "COMMA"
        STRING = "STRING"
        NUMBER = "NUMBER"
        BOOLEAN_TRUE = "BOOLEAN_TRUE"
        BOOLEAN_FALSE = "BOOLEAN_FALSE"
        NULL = "NULL"
        EOF = "EOF"

    # File: models/token.py
    class Token:
        def __init__(self, token_type: TokenType, value=None, line: int = 0, column: int = 0):
            self.type = token_type
            self.value = value
            self.line = line
            self.column = column

    # File: models/json_value.py
    from abc import ABC, abstractmethod

    class JsonValue(ABC):
        @abstractmethod
        def get_value(self):
            pass

        @abstractmethod
        def __str__(self):
            pass

    # File: models/json_string.py
    class JsonString(JsonValue):
        def __init__(self, value: str):
            self.value = value

        def get_value(self):
            return self.value

        def __str__(self):
            return f'"{self._escape_string(self.value)}"'

        def _escape_string(self, s: str) -> str:
            # Escape the characters that cannot appear literally in a JSON string
            result = []
            for char in s:
                if char == '"':
                    result.append('\\"')
                elif char == '\\':
                    result.append('\\\\')
                elif char == '\n':
                    result.append('\\n')
                elif char == '\r':
                    result.append('\\r')
                elif char == '\t':
                    result.append('\\t')
                elif char == '\b':
                    result.append('\\b')
                elif char == '\f':
                    result.append('\\f')
                else:
                    result.append(char)
            return ''.join(result)

    # File: models/json_number.py
    class JsonNumber(JsonValue):
        def __init__(self, value):
            self.value = value

        def get_value(self):
            return self.value

        def __str__(self):
            return str(self.value)

    # File: models/json_boolean.py
    class JsonBoolean(JsonValue):
        def __init__(self, value: bool):
            self.value = value

        def get_value(self):
            return self.value

        def __str__(self):
            return "true" if self.value else "false"

    # File: models/json_null.py
    class JsonNull(JsonValue):
        def get_value(self):
            return None

        def __str__(self):
            return "null"

    # File: models/json_object.py
    class JsonObject(JsonValue):
        def __init__(self):
            self.properties = {}

        def put(self, key: str, value: JsonValue):
            self.properties[key] = value

        def get(self, key: str):
            return self.properties.get(key)

        def get_value(self):
            return {k: v.get_value() for k, v in self.properties.items()}

        def __str__(self):
            # Keys go through JsonString so quotes and backslashes in keys are escaped
            items = [f'{JsonString(k)}: {v}' for k, v in self.properties.items()]
            return "{" + ", ".join(items) + "}"

    # File: models/json_array.py
    class JsonArray(JsonValue):
        def __init__(self):
            self.elements = []

        def add(self, value: JsonValue):
            self.elements.append(value)

        def get_value(self):
            return [elem.get_value() for elem in self.elements]

        def __str__(self):
            items = [str(elem) for elem in self.elements]
            return "[" + ", ".join(items) + "]"

    # File: tokenizer/json_tokenizer.py
    class JsonTokenizer:
        def __init__(self, input_str: str):
            self.input = input_str
            self.position = 0
            self.line = 1
            self.column = 1

        def next_token(self) -> Token:
            self._skip_whitespace()

            if self.position >= len(self.input):
                return Token(TokenType.EOF, line=self.line, column=self.column)

            current = self.input[self.position]
            start_line = self.line
            start_column = self.column

            if current == '{':
                self._advance()
                return Token(TokenType.LEFT_BRACE, line=start_line, column=start_column)
            elif current == '}':
                self._advance()
                return Token(TokenType.RIGHT_BRACE, line=start_line, column=start_column)
            elif current == '[':
                self._advance()
                return Token(TokenType.LEFT_BRACKET, line=start_line, column=start_column)
            elif current == ']':
                self._advance()
                return Token(TokenType.RIGHT_BRACKET, line=start_line, column=start_column)
            elif current == ':':
                self._advance()
                return Token(TokenType.COLON, line=start_line, column=start_column)
            elif current == ',':
                self._advance()
                return Token(TokenType.COMMA, line=start_line, column=start_column)
            elif current == '"':
                return self._parse_string()
            elif current == 't':
                return self._parse_keyword("true", TokenType.BOOLEAN_TRUE, start_line, start_column)
            elif current == 'f':
                return self._parse_keyword("false", TokenType.BOOLEAN_FALSE, start_line, start_column)
            elif current == 'n':
                return self._parse_keyword("null", TokenType.NULL, start_line, start_column)
            elif current in '-0123456789':
                return self._parse_number()
            else:
                raise JsonParseException(f"Unexpected character: {current}", start_line, start_column)

        def _skip_whitespace(self):
            while self.position < len(self.input):
                c = self.input[self.position]
                if c == ' ' or c == '\t':
                    self._advance()
                elif c == '\n':
                    self.line += 1
                    self.column = 1
                    self.position += 1
                elif c == '\r':
                    # Treat \r\n as a single line break
                    if self.position + 1 < len(self.input) and self.input[self.position + 1] == '\n':
                        self.position += 2
                    else:
                        self.position += 1
                    self.line += 1
                    self.column = 1
                else:
                    break

        def _advance(self):
            self.position += 1
            self.column += 1

        def _parse_string(self) -> Token:
            start_line = self.line
            start_column = self.column
            self._advance()  # skip opening quote
            result = []

            while self.position < len(self.input):
                c = self.input[self.position]
                if c == '"':
                    self._advance()
                    return Token(TokenType.STRING, ''.join(result), start_line, start_column)
                elif c == '\\':
                    self._advance()
                    if self.position >= len(self.input):
                        raise JsonParseException("Unterminated string", self.line, self.column)
                    escaped = self.input[self.position]
                    if escaped == '"':
                        result.append('"')
                        self._advance()
                    elif escaped == '\\':
                        result.append('\\')
                        self._advance()
                    elif escaped == '/':
                        result.append('/')
                        self._advance()
                    elif escaped == 'b':
                        result.append('\b')
                        self._advance()
                    elif escaped == 'f':
                        result.append('\f')
                        self._advance()
                    elif escaped == 'n':
                        result.append('\n')
                        self._advance()
                    elif escaped == 'r':
                        result.append('\r')
                        self._advance()
                    elif escaped == 't':
                        result.append('\t')
                        self._advance()
                    elif escaped == 'u':
                        self._advance()
                        if self.position + 4 > len(self.input):
                            raise JsonParseException("Incomplete Unicode escape sequence", self.line, self.column)
                        hex_str = self.input[self.position:self.position + 4]
                        # Check digits explicitly: int(x, 16) alone would also accept forms like "+12a"
                        if any(ch not in '0123456789abcdefABCDEF' for ch in hex_str):
                            raise JsonParseException(f"Invalid Unicode escape sequence: \\u{hex_str}", self.line, self.column)
                        # Note: surrogate pairs are not combined into one code point here
                        result.append(chr(int(hex_str, 16)))
                        self.position += 4
                        self.column += 4
                    else:
                        raise JsonParseException(f"Invalid escape sequence: \\{escaped}", self.line, self.column)
                elif c == '\n' or c == '\r':
                    raise JsonParseException("Unterminated string: newline in string", self.line, self.column)
                else:
                    result.append(c)
                    self._advance()

            raise JsonParseException("Unterminated string", self.line, self.column)

        def _parse_keyword(self, keyword: str, token_type: TokenType, start_line: int, start_column: int) -> Token:
            if self.position + len(keyword) > len(self.input):
                raise JsonParseException("Unexpected end of input", start_line, start_column)
            actual = self.input[self.position:self.position + len(keyword)]
            if actual != keyword:
                raise JsonParseException(f"Unexpected token: {actual}", start_line, start_column)
            self.position += len(keyword)
            self.column += len(keyword)
            return Token(token_type, line=start_line, column=start_column)

        def _parse_number(self) -> Token:
            start_line = self.line
            start_column = self.column
            result = []
            is_float = False  # set when a fraction or an exponent is present

            if self.input[self.position] == '-':
                result.append('-')
                self._advance()

            if self.position >= len(self.input) or not self.input[self.position].isdigit():
                raise JsonParseException("Invalid number format", start_line, start_column)

            # Parse integer part (JSON does not allow leading zeros such as 012)
            if self.input[self.position] == '0':
                result.append('0')
                self._advance()
                if self.position < len(self.input) and self.input[self.position].isdigit():
                    raise JsonParseException("Invalid number format: leading zeros are not allowed", start_line, start_column)
            else:
                while self.position < len(self.input) and self.input[self.position].isdigit():
                    result.append(self.input[self.position])
                    self._advance()

            # Parse decimal part
            if self.position < len(self.input) and self.input[self.position] == '.':
                is_float = True
                result.append('.')
                self._advance()
                if self.position >= len(self.input) or not self.input[self.position].isdigit():
                    raise JsonParseException("Invalid number format: expected digit after decimal point", self.line, self.column)
                while self.position < len(self.input) and self.input[self.position].isdigit():
                    result.append(self.input[self.position])
                    self._advance()

            # Parse exponent part (a number like 1e5 has no '.', but int() cannot parse it)
            if self.position < len(self.input) and self.input[self.position] in 'eE':
                is_float = True
                result.append(self.input[self.position])
                self._advance()
                if self.position < len(self.input) and self.input[self.position] in '+-':
                    result.append(self.input[self.position])
                    self._advance()
                if self.position >= len(self.input) or not self.input[self.position].isdigit():
                    raise JsonParseException("Invalid number format: expected digit in exponent", self.line, self.column)
                while self.position < len(self.input) and self.input[self.position].isdigit():
                    result.append(self.input[self.position])
                    self._advance()

            number_str = ''.join(result)
            try:
                if is_float:
                    return Token(TokenType.NUMBER, float(number_str), start_line, start_column)
                else:
                    return Token(TokenType.NUMBER, int(number_str), start_line, start_column)
            except ValueError:
                raise JsonParseException(f"Invalid number format: {number_str}", start_line, start_column)

        def get_current_line(self) -> int:
            return self.line

        def get_current_column(self) -> int:
            return self.column

    # File: parser/json_parser.py
    class JsonParser:
        def __init__(self, json_string: str):
            self.tokenizer = JsonTokenizer(json_string)
            self.current_token = None

        def parse(self) -> JsonValue:
            self.current_token = self.tokenizer.next_token()
            result = self._parse_value()
            if self.current_token.type != TokenType.EOF:
                raise JsonParseException(f"Unexpected token after JSON value: {self.current_token.type.value}",
                                         self.current_token.line, self.current_token.column)
            return result

        def _parse_value(self) -> JsonValue:
            if self.current_token.type == TokenType.LEFT_BRACE:
                return self._parse_object()
            elif self.current_token.type == TokenType.LEFT_BRACKET:
                return self._parse_array()
            elif self.current_token.type == TokenType.STRING:
                value = JsonString(self.current_token.value)
                self._consume_token()
                return value
            elif self.current_token.type == TokenType.NUMBER:
                value = JsonNumber(self.current_token.value)
                self._consume_token()
                return value
            elif self.current_token.type == TokenType.BOOLEAN_TRUE:
                self._consume_token()
                return JsonBoolean(True)
            elif self.current_token.type == TokenType.BOOLEAN_FALSE:
                self._consume_token()
                return JsonBoolean(False)
            elif self.current_token.type == TokenType.NULL:
                self._consume_token()
                return JsonNull()
            else:
                raise JsonParseException(f"Unexpected token: {self.current_token.type.value}",
                                         self.current_token.line, self.current_token.column)

        def _parse_object(self) -> JsonObject:
            self._consume_token(TokenType.LEFT_BRACE)
            obj = JsonObject()

            if self.current_token.type == TokenType.RIGHT_BRACE:
                self._consume_token()
                return obj

            while True:
                if self.current_token.type != TokenType.STRING:
                    raise JsonParseException("Expected string key in object",
                                             self.current_token.line, self.current_token.column)
                key = self.current_token.value
                self._consume_token()

                self._consume_token(TokenType.COLON)

                value = self._parse_value()
                obj.put(key, value)

                if self.current_token.type == TokenType.RIGHT_BRACE:
                    self._consume_token()
                    break
                elif self.current_token.type == TokenType.COMMA:
                    self._consume_token()
                else:
                    raise JsonParseException("Expected ',' or '}' in object",
                                             self.current_token.line, self.current_token.column)

            return obj

        def _parse_array(self) -> JsonArray:
            self._consume_token(TokenType.LEFT_BRACKET)
            array = JsonArray()

            if self.current_token.type == TokenType.RIGHT_BRACKET:
                self._consume_token()
                return array

            while True:
                value = self._parse_value()
                array.add(value)

                if self.current_token.type == TokenType.RIGHT_BRACKET:
                    self._consume_token()
                    break
                elif self.current_token.type == TokenType.COMMA:
                    self._consume_token()
                else:
                    raise JsonParseException("Expected ',' or ']' in array",
                                             self.current_token.line, self.current_token.column)

            return array

        def _consume_token(self, expected_type: TokenType = None):
            # With an expected type, verify the current token before moving on
            if expected_type is not None:
                if self.current_token.type != expected_type:
                    raise JsonParseException(f"Expected {expected_type.value} but found {self.current_token.type.value}",
                                             self.current_token.line, self.current_token.column)
            self.current_token = self.tokenizer.next_token()

    # File: main.py
    if __name__ == "__main__":
        # Test 1: Simple object
        json1 = '{"name": "John", "age": 30, "isActive": true}'
        parser1 = JsonParser(json1)
        result1 = parser1.parse()
        print("Test 1 - Simple Object:")
        print(f"Parsed: {result1}")
        print(f"Value: {result1.get_value()}")
        print()

        # Test 2: Nested object
        json2 = '{"person": {"name": "Alice", "age": 25}, "city": "New York"}'
        parser2 = JsonParser(json2)
        result2 = parser2.parse()
        print("Test 2 - Nested Object:")
        print(f"Parsed: {result2}")
        print()

        # Test 3: Array
        json3 = '[1, 2, 3, "hello", true, null]'
        parser3 = JsonParser(json3)
        result3 = parser3.parse()
        print("Test 3 - Array:")
        print(f"Parsed: {result3}")
        print(f"Value: {result3.get_value()}")

Java:

    // File: exceptions/json_parse_exception.java
    class JsonParseException extends Exception {
        private int line;
        private int column;
        private String message;

        public JsonParseException(String message, int line, int column) {
            super(String.format("JSON Parse Error at line %d, column %d: %s", line, column, message));
            this.message = message;
            this.line = line;
            this.column = column;
        }

        public int getLine() { return line; }
        public int getColumn() { return column; }
        public String getErrorMessage() { return message; }
    }

    // File: models/token_type.java
    enum TokenType {
        LEFT_BRACE, RIGHT_BRACE, LEFT_BRACKET, RIGHT_BRACKET,
        COLON, COMMA, STRING, NUMBER, BOOLEAN_TRUE, BOOLEAN_FALSE, NULL, EOF
    }

    // File: models/token.java
    class Token {
        private TokenType type;
        private Object value;
        private int line;
        private int column;

        public Token(TokenType type, Object value, int line, int column) {
            this.type = type;
            this.value = value;
            this.line = line;
            this.column = column;
        }

        public TokenType getType() { return type; }
        public Object getValue() { return value; }
        public int getLine() { return line; }
        public int getColumn() { return column; }
    }

    // File: models/json_value.java
    abstract class JsonValue {
        public abstract Object getValue();
        public abstract String toString();
    }

    // File: models/json_string.java
    class JsonString extends JsonValue {
        private String value;

        public JsonString(String value) {
            this.value = value;
        }

        @Override
        public Object getValue() {
            return value;
        }

        @Override
        public String toString() {
            return "\"" + escapeString(value) + "\"";
        }

        // Escape the characters that cannot appear literally in a JSON string
        private String escapeString(String str) {
            StringBuilder sb = new StringBuilder();
            for (char c : str.toCharArray()) {
                switch (c) {
                    case '"': sb.append("\\\""); break;
                    case '\\': sb.append("\\\\"); break;
                    case '\n': sb.append("\\n"); break;
                    case '\r': sb.append("\\r"); break;
                    case '\t': sb.append("\\t"); break;
                    case '\b': sb.append("\\b"); break;
                    case '\f': sb.append("\\f"); break;
                    default: sb.append(c); break;
                }
            }
            return sb.toString();
        }
    }

    // File: models/json_number.java
    class JsonNumber extends JsonValue {
        private Number value;

        public JsonNumber(Number value) {
            this.value = value;
        }

        @Override
        public Object getValue() {
            return value;
        }

        @Override
        public String toString() {
            return value.toString();
        }
    }

    // File: models/json_boolean.java
    class JsonBoolean extends JsonValue {
        private boolean value;

        public JsonBoolean(boolean value) {
            this.value = value;
        }

        @Override
        public Object getValue() {
            return value;
        }

        @Override
        public String toString() {
            return value ? "true" : "false";
        }
    }

    // File: models/json_null.java
    class JsonNull extends JsonValue {
        @Override
        public Object getValue() {
            return null;
        }

        @Override
        public String toString() {
            return "null";
        }
    }

// File: models/json_object.java
class JsonObject extends JsonValue {
    private java.util.Map<String, JsonValue> properties;

    public JsonObject() {
        // LinkedHashMap preserves the insertion order of keys
        this.properties = new java.util.LinkedHashMap<>();
    }

    public void put(String key, JsonValue value) {
        properties.put(key, value);
    }

    public JsonValue get(String key) {
        return properties.get(key);
    }

    @Override
    public Object getValue() {
        java.util.Map<String, Object> result = new java.util.LinkedHashMap<>();
        for (java.util.Map.Entry<String, JsonValue> entry : properties.entrySet()) {
            result.put(entry.getKey(), entry.getValue().getValue());
        }
        return result;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        sb.append("{");
        boolean first = true;
        for (java.util.Map.Entry<String, JsonValue> entry : properties.entrySet()) {
            if (!first) sb.append(", ");
            // Reuse JsonString so keys containing quotes or backslashes are escaped too
            sb.append(new JsonString(entry.getKey()).toString())
              .append(": ")
              .append(entry.getValue().toString());
            first = false;
        }
        sb.append("}");
        return sb.toString();
    }
}

// File: models/json_array.java
class JsonArray extends JsonValue {
    private java.util.List<JsonValue> elements;

    public JsonArray() {
        this.elements = new java.util.ArrayList<>();
    }

    public void add(JsonValue value) {
        elements.add(value);
    }

    @Override
    public Object getValue() {
        java.util.List<Object> result = new java.util.ArrayList<>();
        for (JsonValue value : elements) {
            result.add(value.getValue());
        }
        return result;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        sb.append("[");
        boolean first = true;
        for (JsonValue value : elements) {
            if (!first) sb.append(", ");
            sb.append(value.toString());
            first = false;
        }
        sb.append("]");
        return sb.toString();
    }
}

// File: tokenizer/json_tokenizer.java
class JsonTokenizer {
    private String input;
    private int position;
    private int line;
    private int column;

    public JsonTokenizer(String input) {
        this.input = input;
        this.position = 0;
        this.line = 1;
        this.column = 1;
    }

    public Token nextToken() throws JsonParseException {
        skipWhitespace();

        if (position >= input.length()) {
            return new Token(TokenType.EOF, null, line, column);
        }

        char current = input.charAt(position);
        int startLine = line;
        int startColumn = column;

        switch (current) {
            case '{': advance(); return new Token(TokenType.LEFT_BRACE, null, startLine, startColumn);
            case '}': advance(); return new Token(TokenType.RIGHT_BRACE, null, startLine, startColumn);
            case '[': advance(); return new Token(TokenType.LEFT_BRACKET, null, startLine, startColumn);
            case ']': advance(); return new Token(TokenType.RIGHT_BRACKET, null, startLine, startColumn);
            case ':': advance(); return new Token(TokenType.COLON, null, startLine, startColumn);
            case ',': advance(); return new Token(TokenType.COMMA, null, startLine, startColumn);
            case '"': return parseString();
            case 't': return parseKeyword("true", TokenType.BOOLEAN_TRUE, startLine, startColumn);
            case 'f': return parseKeyword("false", TokenType.BOOLEAN_FALSE, startLine, startColumn);
            case 'n': return parseKeyword("null", TokenType.NULL, startLine, startColumn);
            case '-':
            case '0': case '1': case '2': case '3': case '4':
            case '5': case '6': case '7': case '8': case '9':
                return parseNumber();
            default:
                throw new JsonParseException("Unexpected character: " + current, startLine, startColumn);
        }
    }

    // Skips spaces, tabs, and line breaks, treating "\r\n" as a single line break
    private void skipWhitespace() {
        while (position < input.length()) {
            char c = input.charAt(position);
            if (c == ' ' || c == '\t') {
                advance();
            } else if (c == '\n') {
                line++;
                column = 1;
                position++;
            } else if (c == '\r') {
                if (position + 1 < input.length() && input.charAt(position + 1) == '\n') {
                    position += 2;
                } else {
                    position++;
                }
                line++;
                column = 1;
            } else {
                break;
            }
        }
    }

    private void advance() {
        position++;
        column++;
    }

    private Token parseString() throws JsonParseException {
        int startLine = line;
        int startColumn = column;
        advance(); // skip opening quote
        StringBuilder sb = new StringBuilder();

        while (position < input.length()) {
            char c = input.charAt(position);
            if (c == '"') {
                advance();
                return new Token(TokenType.STRING, sb.toString(), startLine, startColumn);
            } else if (c == '\\') {
                advance();
                if (position >= input.length()) {
                    throw new JsonParseException("Unterminated string", line, column);
                }
                char escaped = input.charAt(position);
                switch (escaped) {
                    case '"': sb.append('"'); advance(); break;
                    case '\\': sb.append('\\'); advance(); break;
                    case '/': sb.append('/'); advance(); break;
                    case 'b': sb.append('\b'); advance(); break;
                    case 'f': sb.append('\f'); advance(); break;
                    case 'n': sb.append('\n'); advance(); break;
                    case 'r': sb.append('\r'); advance(); break;
                    case 't': sb.append('\t'); advance(); break;
                    case 'u':
                        advance();
                        if (position + 4 > input.length()) {
                            throw new JsonParseException("Incomplete Unicode escape sequence", line, column);
                        }
                        String hex = input.substring(position, position + 4);
                        // Require exactly four hex digits; Integer.parseInt alone would also accept a sign such as "+041"
                        if (!hex.matches("[0-9a-fA-F]{4}")) {
                            throw new JsonParseException("Invalid Unicode escape sequence: \\u" + hex, line, column);
                        }
                        sb.append((char) Integer.parseInt(hex, 16));
                        position += 4;
                        column += 4;
                        break;
                    default:
                        throw new JsonParseException("Invalid escape sequence: \\" + escaped, line, column);
                }
            } else if (c == '\n' || c == '\r') {
                throw new JsonParseException("Unterminated string: newline in string", line, column);
            } else if (c < 0x20) {
                // JSON forbids raw control characters inside strings; they must be escaped
                throw new JsonParseException("Unescaped control character in string", line, column);
            } else {
                sb.append(c);
                advance();
            }
        }

        throw new JsonParseException("Unterminated string", line, column);
    }

    private Token parseKeyword(String keyword, TokenType type, int startLine, int startColumn) throws JsonParseException {
        if (position + keyword.length() > input.length()) {
            throw new JsonParseException("Unexpected end of input", startLine, startColumn);
        }
        String actual = input.substring(position, position + keyword.length());
        if (!actual.equals(keyword)) {
            throw new JsonParseException("Unexpected token: " + actual, startLine, startColumn);
        }
        position += keyword.length();
        column += keyword.length();
        return new Token(type, null, startLine, startColumn);
    }

    private Token parseNumber() throws JsonParseException {
        int startLine = line;
        int startColumn = column;
        StringBuilder sb = new StringBuilder();
        boolean isFloatingPoint = false; // set when a fraction or an exponent is present

        if (input.charAt(position) == '-') {
            sb.append('-');
            advance();
        }

        if (!isDigitAt(position)) {
            throw new JsonParseException("Invalid number format", startLine, startColumn);
        }

        // Parse integer part: a single 0, or a nonzero digit followed by more digits
        if (input.charAt(position) == '0') {
            sb.append('0');
            advance();
            if (isDigitAt(position)) {
                throw new JsonParseException("Invalid number format: leading zeros are not allowed", startLine, startColumn);
            }
        } else {
            while (isDigitAt(position)) {
                sb.append(input.charAt(position));
                advance();
            }
        }

        // Parse decimal part
        if (position < input.length() && input.charAt(position) == '.') {
            isFloatingPoint = true;
            sb.append('.');
            advance();
            if (!isDigitAt(position)) {
                throw new JsonParseException("Invalid number format: expected digit after decimal point", line, column);
            }
            while (isDigitAt(position)) {
                sb.append(input.charAt(position));
                advance();
            }
        }

        // Parse exponent part; "1e5" has no decimal point but must still be read as a double
        if (position < input.length() && (input.charAt(position) == 'e' || input.charAt(position) == 'E')) {
            isFloatingPoint = true;
            sb.append(input.charAt(position));
            advance();
            if (position < input.length() && (input.charAt(position) == '+' || input.charAt(position) == '-')) {
                sb.append(input.charAt(position));
                advance();
            }
            if (!isDigitAt(position)) {
                throw new JsonParseException("Invalid number format: expected digit in exponent", line, column);
            }
            while (isDigitAt(position)) {
                sb.append(input.charAt(position));
                advance();
            }
        }

        String numberStr = sb.toString();
        try {
            if (isFloatingPoint) {
                return new Token(TokenType.NUMBER, Double.parseDouble(numberStr), startLine, startColumn);
            }
            long longValue = Long.parseLong(numberStr);
            if (longValue >= Integer.MIN_VALUE && longValue <= Integer.MAX_VALUE) {
                return new Token(TokenType.NUMBER, (int) longValue, startLine, startColumn);
            }
            return new Token(TokenType.NUMBER, longValue, startLine, startColumn);
        } catch (NumberFormatException e) {
            // Integer literals outside the long range are still valid JSON, so keep them as BigInteger
            return new Token(TokenType.NUMBER, new java.math.BigInteger(numberStr), startLine, startColumn);
        }
    }

    // JSON digits are ASCII 0-9 only; Character.isDigit would also accept other Unicode digits
    private boolean isDigitAt(int index) {
        return index < input.length() && input.charAt(index) >= '0' && input.charAt(index) <= '9';
    }

    public int getCurrentLine() { return line; }
    public int getCurrentColumn() { return column; }
}

// File: parser/json_parser.java
class JsonParser {
    private JsonTokenizer tokenizer;
    private Token currentToken;

    public JsonParser(String jsonString) {
        this.tokenizer = new JsonTokenizer(jsonString);
    }

    public JsonValue parse() throws JsonParseException {
        currentToken = tokenizer.nextToken();
        JsonValue result = parseValue();
        // Anything left after the top-level value is an error
        if (currentToken.getType() != TokenType.EOF) {
            throw new JsonParseException("Unexpected token after JSON value: " + currentToken.getType(),
                    currentToken.getLine(), currentToken.getColumn());
        }
        return result;
    }

    private JsonValue parseValue() throws JsonParseException {
        switch (currentToken.getType()) {
            case LEFT_BRACE:
                return parseObject();
            case LEFT_BRACKET:
                return parseArray();
            case STRING:
                JsonString str = new JsonString((String) currentToken.getValue());
                consumeToken();
                return str;
            case NUMBER:
                JsonNumber num = new JsonNumber((Number) currentToken.getValue());
                consumeToken();
                return num;
            case BOOLEAN_TRUE:
                consumeToken();
                return new JsonBoolean(true);
            case BOOLEAN_FALSE:
                consumeToken();
                return new JsonBoolean(false);
            case NULL:
                consumeToken();
                return new JsonNull();
            default:
                throw new JsonParseException("Unexpected token: " + currentToken.getType(),
                        currentToken.getLine(), currentToken.getColumn());
        }
    }

    private JsonObject parseObject() throws JsonParseException {
        consumeToken(TokenType.LEFT_BRACE);
        JsonObject obj = new JsonObject();

        if (currentToken.getType() == TokenType.RIGHT_BRACE) {
            consumeToken();
            return obj;
        }

        while (true) {
            if (currentToken.getType() != TokenType.STRING) {
                throw new JsonParseException("Expected string key in object",
                        currentToken.getLine(), currentToken.getColumn());
            }
            String key = (String) currentToken.getValue();
            consumeToken();

            consumeToken(TokenType.COLON);

            JsonValue value = parseValue();
            obj.put(key, value);

            if (currentToken.getType() == TokenType.RIGHT_BRACE) {
                consumeToken();
                break;
            } else if (currentToken.getType() == TokenType.COMMA) {
                consumeToken();
            } else {
                throw new JsonParseException("Expected ',' or '}' in object",
                        currentToken.getLine(), currentToken.getColumn());
            }
        }

        return obj;
    }

    private JsonArray parseArray() throws JsonParseException {
        consumeToken(TokenType.LEFT_BRACKET);
        JsonArray array = new JsonArray();

        if (currentToken.getType() == TokenType.RIGHT_BRACKET) {
            consumeToken();
            return array;
        }

        while (true) {
            JsonValue value = parseValue();
            array.add(value);

            if (currentToken.getType() == TokenType.RIGHT_BRACKET) {
                consumeToken();
                break;
            } else if (currentToken.getType() == TokenType.COMMA) {
                consumeToken();
            } else {
                throw new JsonParseException("Expected ',' or ']' in array",
                        currentToken.getLine(), currentToken.getColumn());
            }
        }

        return array;
    }

    // Advances to the next token unconditionally
    private void consumeToken() throws JsonParseException {
        currentToken = tokenizer.nextToken();
    }

    // Advances only if the current token has the expected type
    private void consumeToken(TokenType expectedType) throws JsonParseException {
        if (currentToken.getType() != expectedType) {
            throw new JsonParseException("Expected " + expectedType + " but found " + currentToken.getType(),
                    currentToken.getLine(), currentToken.getColumn());
        }
        consumeToken();
    }
}

// File: main.java
public class Main {
    public static void main(String[] args) {
        try {
            String json1 = "{\"name\": \"John\", \"age\": 30, \"isActive\": true}";
            JsonParser parser1 = new JsonParser(json1);
            JsonValue result1 = parser1.parse();
            System.out.println("Test 1 - Simple Object:");
            System.out.println("Parsed: " + result1.toString());
            System.out.println("Value: " + result1.getValue());
        } catch (JsonParseException e) {
            System.err.println("Parse Error: " + e.getMessage());
        }
    }
}

Key Implementation Highlights
1. Separation of Concerns:
- JsonTokenizer handles lexical analysis (tokenization)
- JsonParser handles syntactic analysis (parsing)
- Clear separation makes the code maintainable and testable
2. Recursive Descent Parsing:
- parseValue() calls parseObject() or parseArray() recursively
- Natural fit for JSON’s hierarchical structure
- Handles deeply nested input (depth is bounded in practice by the call stack)
3. Position Tracking:
- Tokenizer tracks line and column numbers
- Errors include precise position information
- Essential for debugging large JSON files
4. Unicode Support:
- Handles escape sequences (\n, \t, \r, etc.)
- Supports Unicode escapes (\uXXXX)
- Rejects invalid or incomplete escape sequences
5. Error Handling:
- Custom exception with position information
- Clear error messages for different failure scenarios
- Helps developers debug invalid JSON quickly
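To make the position-tracking point concrete, here is a minimal sketch of how a line and column pair could be turned into a caret-style message. `ErrorFormatter` and `format` are hypothetical names that do not appear in the implementation above. The sketch assumes 1-based line and column numbers, as the tokenizer produces, and a source line without tabs (a tab would shift the caret).

```java
final class ErrorFormatter {
    /**
     * Builds a message that shows the offending source line with a caret
     * under the reported column. Line and column are assumed to be 1-based.
     */
    static String format(String input, int line, int column, String message) {
        // Split on any JSON line break; the -1 limit keeps trailing empty lines
        String[] lines = input.split("\r\n|\r|\n", -1);
        StringBuilder sb = new StringBuilder();
        sb.append(message)
          .append(" at line ").append(line)
          .append(", column ").append(column);
        if (line >= 1 && line <= lines.length) {
            sb.append('\n').append(lines[line - 1]).append('\n');
            // Pad with (column - 1) spaces so the caret sits under the column
            for (int i = 1; i < column; i++) {
                sb.append(' ');
            }
            sb.append('^');
        }
        return sb.toString();
    }
}
```

A caller could pass the values carried by the custom exception, for example `ErrorFormatter.format(json, 2, 8, "Unexpected token")`, and print the result instead of the bare message.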
Why Separation of Concerns?
Problem: Parsing involves two distinct phases: breaking input into tokens and building structure from tokens.
Without Separation:
Python:
class JsonParser:
    def parse(self, json_string):
        # Mixing tokenization and parsing ❌
        i = 0
        while i < len(json_string):
            if json_string[i] == '{':
                # Parse object while also tokenizing
                # Hard to test, hard to maintain
                pass
            i += 1

Java:
class JsonParser {
    public JsonValue parse(String jsonString) {
        // Mixing tokenization and parsing ❌
        int i = 0;
        while (i < jsonString.length()) {
            if (jsonString.charAt(i) == '{') {
                // Parse object while also tokenizing
                // Hard to test, hard to maintain
            }
            i++;
        }
        return null; // placeholder: this sketch builds no value
    }
}

With Separation:
Python:
class JsonTokenizer:
    def next_token(self):
        # Only responsible for tokenization ✅
        # Easy to test independently
        pass

class JsonParser:
    def __init__(self, json_string):
        self.tokenizer = JsonTokenizer(json_string)

    def parse(self):
        # Only responsible for parsing ✅
        # Uses tokens from tokenizer
        token = self.tokenizer.next_token()
        # Parse using tokens

Java (method bodies elided):
class JsonTokenizer {
    public Token nextToken() {
        // Only responsible for tokenization ✅
        // Easy to test independently
    }
}

class JsonParser {
    private JsonTokenizer tokenizer;

    public JsonParser(String jsonString) {
        this.tokenizer = new JsonTokenizer(jsonString);
    }

    public JsonValue parse() {
        // Only responsible for parsing ✅
        // Uses tokens from tokenizer
        Token token = tokenizer.nextToken();
        // Parse using tokens
    }
}

Benefits:
- Test tokenization independently
- Test parsing independently
- Easier to maintain and debug
- Follows Single Responsibility Principle
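The first benefit can be illustrated with a deliberately tiny lexer that is checked on its own, with no parser involved. This is only a sketch: `MiniLexer` and its `Kind` enum are hypothetical stand-ins for the tokenizer and token types, and it recognizes structural characters and keywords only (no strings or numbers).

```java
import java.util.ArrayList;
import java.util.List;

final class MiniLexer {
    enum Kind { LEFT_BRACE, RIGHT_BRACE, LEFT_BRACKET, RIGHT_BRACKET, COLON, COMMA, TRUE, FALSE, NULL }

    /** Returns the kinds of all tokens in the input, skipping JSON whitespace. */
    static List<Kind> kinds(String input) {
        List<Kind> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
                i++;
                continue;
            }
            switch (c) {
                case '{': out.add(Kind.LEFT_BRACE); i++; break;
                case '}': out.add(Kind.RIGHT_BRACE); i++; break;
                case '[': out.add(Kind.LEFT_BRACKET); i++; break;
                case ']': out.add(Kind.RIGHT_BRACKET); i++; break;
                case ':': out.add(Kind.COLON); i++; break;
                case ',': out.add(Kind.COMMA); i++; break;
                default:
                    // Keywords are matched as whole words starting at offset i
                    if (input.startsWith("true", i)) {
                        out.add(Kind.TRUE); i += 4;
                    } else if (input.startsWith("false", i)) {
                        out.add(Kind.FALSE); i += 5;
                    } else if (input.startsWith("null", i)) {
                        out.add(Kind.NULL); i += 4;
                    } else {
                        throw new IllegalArgumentException(
                                "Unexpected character '" + c + "' at offset " + i);
                    }
            }
        }
        return out;
    }
}
```

Because the lexer returns plain token kinds, a unit test can assert on `MiniLexer.kinds("[true, null]")` directly, and parser tests can later be written against token sequences rather than raw text.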
Why Recursive Descent Parsing?
Problem: JSON has a recursive structure: objects and arrays can contain other objects and arrays.
Without Recursion:
Python:
class JsonParser:
    def parse(self, json_string):
        # Need explicit stack to handle nesting ❌
        stack = []
        depth = 0
        # Complex state management
        # Hard to understand and maintain

Java:
import java.util.Stack;

class JsonParser {
    public JsonValue parse(String jsonString) {
        // Need explicit stack to handle nesting ❌
        Stack<Object> stack = new Stack<>();
        int depth = 0;
        // Complex state management
        // Hard to understand and maintain
        return null; // placeholder: this sketch builds no value
    }
}

With Recursive Descent:
Python:
class JsonParser:
    def _parse_value(self):
        # Natural recursion ✅
        if token.type == LEFT_BRACE:
            return self._parse_object()  # Calls _parse_value recursively

    def _parse_object(self):
        # Calls _parse_value for nested values
        value = self._parse_value()  # Recursive call

Java (other token types and return statements elided):
class JsonParser {
    private JsonValue parseValue() {
        // Natural recursion ✅
        if (currentToken.getType() == TokenType.LEFT_BRACE) {
            return parseObject(); // Calls parseValue recursively
        }
    }

    private JsonObject parseObject() {
        // Calls parseValue for nested values
        JsonValue value = parseValue(); // Recursive call
    }
}

Benefits:
- Natural fit for hierarchical structures
- Easy to understand and implement
- Code mirrors JSON structure
- Handles arbitrary nesting depth
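One caveat on the last benefit: every nesting level consumes a call-stack frame, so extremely deep input can exhaust the stack (a `StackOverflowError` on the JVM). A common mitigation is to pass a depth counter down the recursion and fail with a clear error once a limit is exceeded. The sketch below shows the idea on nested arrays only; `NestingGuard` and the `MAX_DEPTH` value of 1000 are assumptions for illustration and do not come from the implementation above.

```java
final class NestingGuard {
    static final int MAX_DEPTH = 1000; // hypothetical limit, tune for your stack size

    private final String input;
    private int position = 0;

    NestingGuard(String input) {
        this.input = input;
    }

    /** Parses arrays of arrays such as "[[],[[]]]" and returns the deepest level reached. */
    int parse() {
        int deepest = parseArray(1);
        if (position != input.length()) {
            throw new IllegalArgumentException("Trailing characters at offset " + position);
        }
        return deepest;
    }

    private int parseArray(int depth) {
        // Fail early with a clear message instead of overflowing the stack
        if (depth > MAX_DEPTH) {
            throw new IllegalStateException("Nesting deeper than " + MAX_DEPTH + " levels");
        }
        expect('[');
        int deepest = depth;
        if (peek() != ']') {
            do {
                deepest = Math.max(deepest, parseArray(depth + 1)); // recursive descent
            } while (tryConsume(','));
        }
        expect(']');
        return deepest;
    }

    private char peek() {
        return position < input.length() ? input.charAt(position) : '\0';
    }

    private boolean tryConsume(char c) {
        if (peek() == c) {
            position++;
            return true;
        }
        return false;
    }

    private void expect(char c) {
        if (peek() != c) {
            throw new IllegalArgumentException("Expected '" + c + "' at offset " + position);
        }
        position++;
    }
}
```

The same pattern would apply to the full parser by threading a depth argument through `parseValue()`, `parseObject()`, and `parseArray()`.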
Why Polymorphic JsonValue Hierarchy?
Problem: Different JSON types need to be represented behind a common interface.
Without Polymorphism:
Python:
# Need type checking everywhere ❌
if isinstance(value, dict):
    pass  # Handle object
elif isinstance(value, list):
    pass  # Handle array
elif isinstance(value, str):
    pass  # Handle string
# Type checking scattered everywhere

Java:
// Need type checking everywhere ❌
if (value instanceof Map) {
    // Handle object
} else if (value instanceof List) {
    // Handle array
} else if (value instanceof String) {
    // Handle string
}
// Type checking scattered everywhere

With Polymorphism:
Python:
class JsonValue(ABC):
    @abstractmethod
    def get_value(self):
        pass

# All types implement same interface ✅
value.get_value()  # Works for all types
# No type checking needed

Java:
abstract class JsonValue {
    public abstract Object getValue();
}

// All types implement same interface ✅
value.getValue(); // Works for all types
// No type checking needed

Benefits:
- Type safety
- Extensible - easy to add new types
- Clean interface
- No scattered type checks
System Flow Diagrams
Parsing Flow
Tokenization Flow
String Parsing with Escapes
flowchart TD
A[Start parseString] --> B[Skip opening quote]
B --> C{Character?}
C -->|Closing quote| D[End string, return token]
C -->|Backslash| E[Escape sequence]
C -->|Newline| F[Error: newline in string]
C -->|Other| G[Add to result]
E --> H{Escape type?}
H -->|"quote, backslash, slash"| I[Single char]
H -->|"b, f, n, r, t"| J[Control char]
H -->|u| K[Unicode]
H -->|Other| L[Error: invalid escape]
K --> M[Read 4 hex digits]
M --> N{Valid hex?}
N -->|Yes| O[Convert to char]
N -->|No| P[Error: invalid Unicode]
I --> G
J --> G
O --> G
G --> C
style D fill:#10b981
style F fill:#ef4444
style L fill:#ef4444
style P fill:#ef4444
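The flow above can be sketched as a standalone decoder for the text between the quotes. `EscapeDecoder` is a hypothetical helper, not the tokenizer itself, and it reports errors with `IllegalArgumentException` rather than the custom exception. One detail worth noting: characters outside the Basic Multilingual Plane arrive as two consecutive backslash-u escapes (a surrogate pair), and appending each half as a `char` produces the correct UTF-16 string in Java.

```java
final class EscapeDecoder {
    /** Decodes JSON escape sequences in the body of a string (without the surrounding quotes). */
    static String decode(String body) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < body.length()) {
            char c = body.charAt(i++);
            if (c != '\\') {
                out.append(c);
                continue;
            }
            if (i >= body.length()) {
                throw new IllegalArgumentException("Dangling backslash at end of string");
            }
            char escaped = body.charAt(i++);
            switch (escaped) {
                case '"': out.append('"'); break;
                case '\\': out.append('\\'); break;
                case '/': out.append('/'); break;
                case 'b': out.append('\b'); break;
                case 'f': out.append('\f'); break;
                case 'n': out.append('\n'); break;
                case 'r': out.append('\r'); break;
                case 't': out.append('\t'); break;
                case 'u':
                    if (i + 4 > body.length()) {
                        throw new IllegalArgumentException("Incomplete Unicode escape");
                    }
                    // Each escape yields one UTF-16 code unit; surrogate halves pair up naturally
                    out.append((char) parseHex4(body, i));
                    i += 4;
                    break;
                default:
                    throw new IllegalArgumentException("Invalid escape character: " + escaped);
            }
        }
        return out.toString();
    }

    /** Reads exactly four ASCII hex digits starting at the given offset. */
    private static int parseHex4(String s, int start) {
        int value = 0;
        for (int k = start; k < start + 4; k++) {
            char c = s.charAt(k);
            int digit;
            if (c >= '0' && c <= '9') {
                digit = c - '0';
            } else if (c >= 'a' && c <= 'f') {
                digit = c - 'a' + 10;
            } else if (c >= 'A' && c <= 'F') {
                digit = c - 'A' + 10;
            } else {
                throw new IllegalArgumentException("Invalid hex digit at offset " + k);
            }
            value = value * 16 + digit;
        }
        return value;
    }
}
```

Checking each hex digit by hand avoids the leniency of `Integer.parseInt`, which would accept a leading sign.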
Extensibility & Future Enhancements
Easy to Extend
1. Add JSON Schema Validation:
Python:
class JsonSchemaValidator:
    def validate(self, json_value: JsonValue, schema: dict) -> bool:
        """Validate JSON value against schema."""
        if isinstance(json_value, JsonObject):
            return self._validate_object(json_value, schema)
        # ... other validations

# Use it
parser = JsonParser(json_string)
value = parser.parse()
validator = JsonSchemaValidator()
if validator.validate(value, schema):
    print("Valid!")

Java (remaining validations and return statement elided):
import java.util.Map;

class JsonSchemaValidator {
    public boolean validate(JsonValue jsonValue, Map<String, Object> schema) {
        // Validate JSON value against schema
        if (jsonValue instanceof JsonObject) {
            return validateObject((JsonObject) jsonValue, schema);
        }
        // ... other validations
    }
}

// Use it
JsonParser parser = new JsonParser(jsonString);
JsonValue value = parser.parse();
JsonSchemaValidator validator = new JsonSchemaValidator();
if (validator.validate(value, schema)) {
    System.out.println("Valid!");
}

2. Add Pretty Printing:
Python:
class JsonPrettyPrinter:
    def print(self, json_value: JsonValue, indent: int = 2) -> str:
        """Format JSON with indentation."""
        if isinstance(json_value, JsonObject):
            return self._print_object(json_value, indent, 0)
        # ... other types

# Use it
parser = JsonParser(json_string)
value = parser.parse()
printer = JsonPrettyPrinter()
print(printer.print(value))

Java (other types elided):
class JsonPrettyPrinter {
    public String print(JsonValue jsonValue, int indent) {
        // Format JSON with indentation
        if (jsonValue instanceof JsonObject) {
            return printObject((JsonObject) jsonValue, indent, 0);
        }
        // ... other types
    }
}

// Use it
JsonParser parser = new JsonParser(jsonString);
JsonValue value = parser.parse();
JsonPrettyPrinter printer = new JsonPrettyPrinter();
System.out.println(printer.print(value, 2));

3. Add Streaming Parser:
Python:
class StreamingJsonParser:
    """Parse large JSON files without loading entire file."""

    def parse_stream(self, file_handle):
        """Parse JSON from file stream."""
        # Read chunks and parse incrementally
        # Useful for very large JSON files
        pass

Java:
import java.io.InputStream;

class StreamingJsonParser {
    /**
     * Parse large JSON files without loading entire file.
     */
    public void parseStream(InputStream inputStream) {
        // Read chunks and parse incrementally
        // Useful for very large JSON files
    }
}

Future Enhancements
1. JSONPath Support:
- Query JSON using JSONPath expressions
- Example: $.users[0].name
2. JSON Transformations:
- Transform JSON structure
- Filter, map, reduce operations
3. Custom Value Types:
- Extend JsonValue hierarchy
- Add support for custom types
4. Performance Optimizations:
- Streaming parser for large files
- Lazy parsing for large structures
- Memory-efficient implementations
5. Validation Enhancements:
- JSON Schema validation
- Custom validation rules
- Type checking
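As a rough illustration of the JSONPath idea from item 1, the sketch below resolves simple paths such as `$.users[0].name` against the nested `Map` and `List` structure that `getValue()` returns. It is an assumption-laden simplification: `SimplePath` is a hypothetical class, only dot keys and numeric indexes are supported, and real JSONPath features such as wildcards, filters, and quoted keys are out of scope.

```java
import java.util.List;
import java.util.Map;

final class SimplePath {
    /**
     * Resolves a path made of ".key" and "[index]" steps, starting at "$".
     * Returns null when a step does not match the data.
     */
    static Object resolve(Object root, String path) {
        if (!path.startsWith("$")) {
            throw new IllegalArgumentException("Path must start with '$'");
        }
        Object current = root;
        int i = 1;
        while (i < path.length()) {
            char c = path.charAt(i);
            if (c == '.') {
                // Read the key up to the next '.' or '['
                int end = i + 1;
                while (end < path.length() && path.charAt(end) != '.' && path.charAt(end) != '[') {
                    end++;
                }
                String key = path.substring(i + 1, end);
                if (!(current instanceof Map)) {
                    return null;
                }
                current = ((Map<?, ?>) current).get(key);
                i = end;
            } else if (c == '[') {
                int end = path.indexOf(']', i);
                if (end < 0) {
                    throw new IllegalArgumentException("Missing ']' in path");
                }
                int index = Integer.parseInt(path.substring(i + 1, end));
                if (!(current instanceof List)) {
                    return null;
                }
                List<?> list = (List<?>) current;
                if (index < 0 || index >= list.size()) {
                    return null;
                }
                current = list.get(index);
                i = end + 1;
            } else {
                throw new IllegalArgumentException("Unexpected character '" + c + "' in path");
            }
        }
        return current;
    }
}
```

With a parsed document in hand, a call might look like `SimplePath.resolve(parser.parse().getValue(), "$.users[0].name")`.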
Summary
Key Takeaways
- Follow Systematic Approach - Don’t jump to code
- Separate Concerns - Tokenization vs Parsing
- Use Recursive Descent - Natural fit for hierarchical structures
- Assign Responsibilities - Single Responsibility Principle
- Design Class Diagrams - Visualize structure and relationships
- Handle Edge Cases - Invalid JSON, unterminated strings, malformed numbers
- Track Position - Line and column for error reporting
- Use Polymorphism - JsonValue hierarchy
- Separation of Concerns - Tokenizer vs Parser
- Recursive Descent Parsing - Natural recursion for nested structures
- Polymorphism - JsonValue hierarchy
- Composite Pattern - JsonObject and JsonArray contain nested JsonValue instances
Best Practices Demonstrated
- SOLID principles (SRP, OCP)
- Clear separation of concerns
- Comprehensive error handling
- Position tracking for debugging
- Extensible design
- Clean code structure
Next Steps
Now that you’ve mastered the JSON Parser:
Practice Similar Problems:
- XML Parser
- Expression Parser (calculator)
- Configuration File Parser
- Markdown Parser
Explore More Concepts:
Deepen Your Understanding:
- Study parser generators (ANTLR, Yacc)
- Learn about different parsing algorithms
- Explore AST (Abstract Syntax Tree) construction
- Study error recovery techniques