fix: resolve dump_update failures due to DataFrame empty check, date filter, and NaT handling #2147

Open
digital-wizard48 wants to merge 1 commit into microsoft:main from digital-wizard48:fix/issue-126-fail-to-update-the-data

@digital-wizard48

Summary

Fixes bugs in scripts/dump_bin.py that caused dump_update to fail. The NaT-related changes are:

  1. NaT in date column: After the date field was converted with pd.to_datetime(), rows with NaT values (unparseable dates) were retained. They caused a 'NaT is not in list' error when the minimum date index was looked up in the calendar list. Fixed by calling df.dropna(subset=[self.date_field_name]) after the conversion in _get_source_data.

  2. NaT guard in get_datetime_index: Added an explicit check for a NaT minimum index value, so the failure produces a clear error message instead of a cryptic list lookup error.
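
For reviewers who want to see the two ideas in isolation, here is a minimal standalone sketch. It is not the code from this PR's diff: the function names drop_unparseable_dates and first_calendar_position are hypothetical, the calendar is assumed to be a plain list of pd.Timestamp values, and errors="coerce" is an assumption made so that bad date strings become NaT.

```python
import pandas as pd


def drop_unparseable_dates(df: pd.DataFrame, date_field_name: str = "date") -> pd.DataFrame:
    """Convert the date column and drop rows whose date became NaT.

    Hypothetical helper for illustration; not part of scripts/dump_bin.py.
    """
    df = df.copy()
    # Assumption: errors="coerce" turns unparseable values into NaT.
    df[date_field_name] = pd.to_datetime(df[date_field_name], errors="coerce")
    # Remove the NaT rows so they never reach the calendar lookup.
    return df.dropna(subset=[date_field_name])


def first_calendar_position(df: pd.DataFrame, calendar: list, date_field_name: str = "date") -> int:
    """Return the position of the earliest date in the calendar, guarding against NaT."""
    min_date = df[date_field_name].min()
    # An empty or all-NaT column yields NaT; fail early with a readable message.
    if pd.isna(min_date):
        raise ValueError(f"no valid dates in column {date_field_name!r}")
    # list.index raises ValueError if the date is missing from the calendar.
    return calendar.index(min_date)


raw = pd.DataFrame(
    {"date": ["2020-01-02", "not a date", "2020-01-03"], "close": [1.0, 2.0, 3.0]}
)
calendar = [pd.Timestamp("2020-01-01"), pd.Timestamp("2020-01-02"), pd.Timestamp("2020-01-03")]

clean = drop_unparseable_dates(raw)
position = first_calendar_position(clean, calendar)
```

In this sketch the row with the bad date is dropped before the lookup, and a frame with no valid dates raises a ValueError that names the column instead of surfacing a bare list lookup failure.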

These changes address the errors reported when running:

python scripts/dump_bin.py dump_update --csv_path ~/.qlib/csv_data/cn_data --qlib_dir ~/.qlib/qlib_data/my_data --include_fields date,open,close,high,low,volume

Fixes #126


This PR was auto-generated by Gittensor bot using Claude AI to fix a reported issue.

Development

Successfully merging this pull request may close these issues.

fail to update the data
