Fix CSV parsing to use proper csv.reader() instead of naive string splitting #104
The `parse_csv_raw()` function was using naive string splitting to parse CSV headers, which doesn't handle CSV edge cases properly. This caused incorrect parsing when CSV files contained quoted fields with commas, escaped quotes, or other standard CSV formatting.

Problem
The current implementation used:
This fails for CSV files like:
The naive splitting would incorrectly parse this as 6 columns instead of 4:
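A minimal sketch of this failure mode. The header line and the `split_naively()` helper are hypothetical, chosen only so that comma splitting produces 6 fields where 4 are intended; they are not taken from the original report or from `parse_csv_raw()`.

```python
# Hypothetical header line with two quoted fields that contain commas.
header_line = 'id,"last, first","city, state",score'


def split_naively(line: str) -> list[str]:
    """Split on every comma, ignoring CSV quoting rules (illustrative only)."""
    return line.strip().split(",")


naive_fields = split_naively(header_line)

# Each quoted field is cut in two at its embedded comma,
# and the quote characters are left in the column names.
print(len(naive_fields))  # 6
print(naive_fields)
```

The stray quote characters in names such as `"last` are a second symptom: even when the column count happens to match, the column names would be wrong.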
Solution
Updated the function to use Python's standard `csv` module:
This handles standard CSV quoting (quoted fields containing commas, escaped quotes) and parses the same header correctly as:
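A sketch of the `csv.reader()` approach, under the assumption that the header is available as a single line of text. The `parse_header()` helper and the sample lines are hypothetical and are not the actual one-line change in `parse_csv_raw()`.

```python
import csv

# Hypothetical header line with two quoted fields that contain commas.
header_line = 'id,"last, first","city, state",score'


def parse_header(line: str) -> list[str]:
    """Parse one CSV line with the standard csv module (illustrative only)."""
    # csv.reader accepts any iterable of strings; next() returns the first row.
    return next(csv.reader([line]))


fields = parse_header(header_line)
print(fields)  # ['id', 'last, first', 'city, state', 'score']

# Doubled quotes inside a quoted field are unescaped to a single quote.
print(parse_header('a,"say ""hi""",b'))  # ['a', 'say "hi"', 'b']
```

When reading from a file rather than a string, the `csv` documentation recommends opening it with `newline=""` so that newlines embedded in quoted fields are handled correctly.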
Changes
Updated the `parse_csv_raw()` function in `outrank/core_utils.py` (line 396)
Added `test_parse_csv_with_quoted_fields()` in `tests/data_io_test.py`
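A rough idea of what a test along these lines could look like. The real signature of `parse_csv_raw()` is not shown here, so `read_header()` is a hypothetical stand-in that only reads the header row, and the file contents are invented for illustration.

```python
import csv
import os
import tempfile


def read_header(path: str) -> list[str]:
    """Hypothetical stand-in for the header-reading step of parse_csv_raw()."""
    with open(path, newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle))


def test_parse_csv_with_quoted_fields() -> None:
    # Invented sample data: two header fields contain commas inside quotes.
    content = (
        'id,"last, first","city, state",score\n'
        '1,"Doe, Jane","Paris, FR",0.5\n'
    )
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "quoted.csv")
        with open(path, "w", newline="", encoding="utf-8") as handle:
            handle.write(content)
        header = read_header(path)

    assert header == ["id", "last, first", "city, state", "score"]
    assert len(header) == 4


test_parse_csv_with_quoted_fields()
```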
The fix is surgical and minimal - only one line changed. The `csv` module was already imported and used elsewhere in the codebase (`parse_ob_csv_line()` function), so this change brings consistency across all CSV parsing methods.

All existing tests pass and the new test case verifies that complex CSV files with quoted fields are now parsed correctly.
Fixes #103.