nr-catalog-tools: Architectural Overview
What it Does
This package provides a stable, unified Python interface to three public NR binary-black-hole waveform catalogs — SXS (SpEC), RIT (LazEv), and MAYA/GT (MayaKranc) — serving three overlapping communities:
- LIGO-Virgo-KAGRA analyses: PyCBC-compatible waveform time series and parameter dicts with consistent physical units and epoch conventions across all catalogs, for use in injection studies, template banks, and parameter estimation
- Waveform modeling: frame-alignment tools (Wigner D-matrix rotation, `f_lower` extraction, surrogate mode rotation) for calibrating and validating EOB, phenomenological, and surrogate models against any NR catalog
- Cross-catalog accuracy studies: noise-weighted mismatch minimization over SO(3) rotations, time/phase shifts, and BMS supertranslations to quantify intrinsic NR catalog errors
All three backends implement an identical interface so analysis code is catalog-agnostic.
Class Hierarchy
CatalogABC (catalog.py) ← abstract base interface
└── CatalogBase (catalog.py) ← shared get()/get_parameters() with internal dict simulations registry
    ├── RITCatalog (rit.py) ← web-scraped .txt metadata; HDF5 / tar.gz data
    ├── SXSCatalog (sxs.py) ← delegates to new sxs.Simulations (Zenodo-backed)
    └── MayaCatalog (maya.py) ← pickle metadata; HDF5 data via mayawaves

sxs.WaveformModes (ndarray subclass)
└── WaveformModes (waveform.py) ← adds physical unit scaling, frame rotation, matching
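As a rough illustration of the hierarchy above, the sketch below shows how an abstract interface plus a shared base class with a dict registry can keep calling code catalog-agnostic. Only the names `CatalogABC`, `CatalogBase`, `get()`, `get_parameters()` and `simulations` come from this page. `ToyCatalog`, the `_load_waveform()` hook, `count_modes()` and the in-memory data are hypothetical stand-ins, not the package's implementation.

```python
from abc import ABC, abstractmethod


class CatalogABC(ABC):
    """Abstract interface every catalog backend is expected to satisfy."""

    @abstractmethod
    def get(self, sim_name):
        ...

    @abstractmethod
    def get_parameters(self, sim_name):
        ...


class CatalogBase(CatalogABC):
    """Shared logic built on an internal dict registry of simulations."""

    def __init__(self, simulations):
        # Registry: simulation name -> metadata dict (assumed layout).
        self.simulations = dict(simulations)

    def get_parameters(self, sim_name):
        if sim_name not in self.simulations:
            raise KeyError(f"unknown simulation: {sim_name}")
        return self.simulations[sim_name]

    def get(self, sim_name):
        # Shared dispatch: validate the name, then defer to the backend hook.
        metadata = self.get_parameters(sim_name)
        return self._load_waveform(sim_name, metadata)

    def _load_waveform(self, sim_name, metadata):  # hypothetical hook
        raise NotImplementedError


class ToyCatalog(CatalogBase):
    """Hypothetical backend that returns a placeholder instead of reading files."""

    def _load_waveform(self, sim_name, metadata):
        return {"name": sim_name, "n_modes": metadata["n_modes"]}


def count_modes(catalog, sim_name):
    """Catalog-agnostic helper: relies only on the shared interface."""
    return catalog.get(sim_name)["n_modes"]


toy = ToyCatalog({"TOY:0001": {"n_modes": 5}})
```

Because `count_modes()` touches only `get()`, it would work unchanged with any backend that honors the same interface, which is the property the class tree is meant to guarantee.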
Key Design Decisions
- `CatalogBase.get()` is the central dispatch (catalog.md). It handles download-on-demand for RIT/MAYA, then calls `WaveformModes.load_from_h5()`. `SXSCatalog` overrides `get()` entirely because SXS data goes through `sxs.load()` / Zenodo, not local HDF5.
- Lazy path resolution for SXS (sxs.md). All path columns are stub empty strings at catalog-load time. Resolving real paths for all ~2000 SXS simulations would require ~2000 `sxs.load()` calls. Actual file access is deferred to `get()`.
- `_filepath` as per-instance attribute (waveform.md). Extracted from `w_attributes` before passing to the `sxs.WaveformModes` parent `__new__`, preventing class-level sharing where loading a second simulation would silently overwrite the first object's path.
- `sxs` memoryview → numpy wrapping (waveform.md). `sxs.WaveformModes.data` may return a memoryview (not a writable numpy array). All arithmetic wraps it with `np.array(..., dtype=complex)`.
- `delta_t` dual convention (waveform.md). Values `> 1/128` are dimensionless M units (NR native); `≤ 1/128` are physical seconds. The returned `TimeSeries.delta_t` is always in seconds.
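The `delta_t` dual convention can be made concrete with a small helper. This is a minimal sketch, assuming the dimensionless case is converted with the usual geometric-unit factor G·M☉/c³ times the total mass. The function name `delta_t_in_seconds` and its arguments are hypothetical, and the package may implement the conversion differently.

```python
# Solar mass expressed in seconds: G * M_sun / c**3.
GM_SUN_SI = 1.32712440018e20    # m^3 / s^2 (heliocentric gravitational constant)
C_SI = 299792458.0              # m / s
MTSUN_SI = GM_SUN_SI / C_SI**3  # roughly 4.9255e-6 s

DT_THRESHOLD = 1.0 / 128.0  # boundary quoted in the list above


def delta_t_in_seconds(delta_t, total_mass_msun):
    """Interpret delta_t per the dual convention and return seconds (hypothetical helper)."""
    if delta_t <= 0:
        raise ValueError("delta_t must be positive")
    if delta_t > DT_THRESHOLD:
        # Treated as dimensionless (units of M): scale by the total mass in seconds.
        return delta_t * total_mass_msun * MTSUN_SI
    # Already a physical sampling interval in seconds.
    return delta_t


dt_from_m_units = delta_t_in_seconds(0.5, total_mass_msun=60.0)
dt_from_seconds = delta_t_in_seconds(1.0 / 4096.0, total_mass_msun=60.0)
```

The threshold works because realistic physical sampling intervals (for example 1/4096 s) sit far below 1/128, while NR-native steps in units of M are typically larger than that.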
Data Flows
RIT load path
RITCatalog.load()
→ RITCatalogHelper.read_metadata_df_from_disk() # ~/.cache/RIT/metadata/metadata.csv
→ [scrape web if missing]
→ RITCatalog.get(sim_name)
→ RITCatalogHelper.download_waveform_data() # ExtrapStrain_RIT-BBH-XXXX-nYYY.h5
→ WaveformModes.load_from_h5()
reads amp_l{l}_m{m}/X,Y + phase_l{l}_m{m}/X,Y
interpolates all modes onto common uniform grid
returns complex (n_times, n_modes) array
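The last three steps of the RIT path (read per-mode amplitude and phase, resample onto one uniform grid, stack into a complex array) can be sketched with NumPy alone. The function below is an illustration under stated assumptions, not the body of `load_from_h5()`: it uses linear interpolation, assumes the convention h_lm = A·exp(iφ), and takes plain dicts instead of HDF5 groups. The name `modes_on_uniform_grid` is hypothetical.

```python
import numpy as np


def modes_on_uniform_grid(amp_samples, phase_samples, delta_t):
    """Resample per-mode amplitude/phase onto one uniform time grid.

    amp_samples / phase_samples: dict mapping (l, m) -> (times, values),
    mimicking the X/Y datasets of each amp_l{l}_m{m} / phase_l{l}_m{m} group.
    Returns (times, data, mode_list) with data of shape (n_times, n_modes).
    """
    mode_list = sorted(amp_samples)
    # Common window: where every mode has both amplitude and phase samples.
    t_start = max(max(amp_samples[k][0][0], phase_samples[k][0][0]) for k in mode_list)
    t_end = min(min(amp_samples[k][0][-1], phase_samples[k][0][-1]) for k in mode_list)
    # Small tolerance guards against floating-point round-off in the division.
    n_times = int(np.floor((t_end - t_start) / delta_t + 1e-9)) + 1
    times = t_start + delta_t * np.arange(n_times)

    data = np.empty((n_times, len(mode_list)), dtype=complex)
    for j, key in enumerate(mode_list):
        amp = np.interp(times, *amp_samples[key])
        phase = np.interp(times, *phase_samples[key])
        # Assumed convention: h_lm = A * exp(i * phi).
        data[:, j] = amp * np.exp(1j * phase)
    return times, data, mode_list


# Toy input: two modes sampled on different, non-matching time grids.
t_a = np.linspace(0.0, 10.0, 21)
t_b = np.linspace(1.0, 9.0, 33)
amp = {(2, 2): (t_a, np.ones_like(t_a)), (2, 1): (t_b, 0.1 * np.ones_like(t_b))}
phase = {(2, 2): (t_a, -0.2 * t_a), (2, 1): (t_b, -0.1 * t_b)}
times, data, modes = modes_on_uniform_grid(amp, phase, delta_t=0.5)
```

Interpolating amplitude and phase separately, rather than the real and imaginary parts, keeps the resampled mode smooth even when the phase advances quickly between samples.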
SXS load path
SXSCatalog.load()
→ sxs.load("simulations", download=None)
→ SXSCatalog.get(sim_name)
→ sxs.load(sim_name, extrapolation="N{n}", auto_supersede=True) # Zenodo-backed
→ sim_obj.strain # sxs.WaveformModes
→ WaveformModes(raw_obj.data, raw_obj.time, ...) # thin wrapper
MAYA load path
MayaCatalog.load()
→ download MAYAmetadata.pkl → catalog.zip # ~/.cache/MAYA/
→ parse pickle → DataFrame → simulations dict
→ MayaCatalog.get(sim_name)
→ download GT{ID}.h5
→ WaveformModes.load_from_h5()
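The download step in the RIT and MAYA paths follows a download-on-demand pattern: fetch a file only when it is missing from the local cache. The sketch below shows one way such a step could look. `ensure_cached`, `fake_fetch` and the file name `GT0001.h5` are hypothetical, and the network request is replaced by a callable so the example runs offline.

```python
from pathlib import Path
import tempfile


def ensure_cached(file_name, cache_dir, fetch):
    """Return the local path for file_name, fetching it only if it is absent.

    fetch is a callable (file_name) -> bytes standing in for the real download.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    local_path = cache_dir / file_name
    if not local_path.exists():
        # Write to a temporary name first so an interrupted download
        # never leaves a truncated file that looks complete.
        tmp_path = local_path.with_name(local_path.name + ".part")
        tmp_path.write_bytes(fetch(file_name))
        tmp_path.replace(local_path)
    return local_path


calls = []


def fake_fetch(file_name):
    # Hypothetical stand-in for a network request; records each call.
    calls.append(file_name)
    return b"placeholder bytes"


with tempfile.TemporaryDirectory() as tmp:
    first = ensure_cached("GT0001.h5", tmp, fake_fetch)
    second = ensure_cached("GT0001.h5", tmp, fake_fetch)
    same_path = first == second
    cached_bytes = second.read_bytes()
```

The second call finds the file already present and skips the fetch, so repeated `get()` calls for the same simulation would cost only a local file lookup.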
WaveformModes Core Methods
| Method | What it does |
|---|---|
| `load_from_h5()` | HDF5 → complex (n_times, n_modes); interpolates amp+phase onto uniform grid |
| `load_from_targz()` | ASCII .asc/.dat/.txt in tar.gz → same output |
| `get_mode(l, m, M, D, dt)` | Single mode in physical units; epoch set at (2,2) peak |
| `f_lower_at_1Msun(t)` | Instantaneous GW freq (Hz @ 1 M☉) from (2,2); divide by M for physical |
| `get_td_waveform(M, D, iota, phi, dt)` | Sky-averaged h₊+ih× summed over all modes |
| `trim_to_relaxation_time(M)` | (2,2) mode starting at relaxation epoch |
| `rotated(R)` | Wigner D-matrix rotation of all modes (inherited + overridden) |
| `match_sphere_averaged(other, psd, f_lower)` | Mismatch minimized over t_c, φ_c, R∈SO(3) via differential evolution |
| `match_sphere_averaged_bms_maximized(...)` | Same + BMS supertranslation optimization via spin-weighted Gaunt coefficients (scri) |
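To illustrate the "Hz @ 1 M☉, divide by M" scaling in the `f_lower_at_1Msun(t)` row, the sketch below estimates an instantaneous frequency from the phase of a toy (2,2) mode. It assumes times are in units of M and takes the frequency as |dφ/dt|/(2π). The helper names `gw_frequency_at_1msun` and `physical_frequency` are hypothetical, and the package's own method may differ in detail.

```python
import numpy as np

MTSUN_SI = 1.32712440018e20 / 299792458.0**3  # G * M_sun / c**3 in seconds


def gw_frequency_at_1msun(times_m, h22):
    """Instantaneous (2,2) GW frequency in Hz for a 1 solar-mass system.

    times_m: sample times in units of M; h22: complex (2,2) mode samples.
    """
    phase = np.unwrap(np.angle(h22))
    omega = np.abs(np.gradient(phase, times_m))  # rad per M; sign dropped
    return omega / (2.0 * np.pi * MTSUN_SI)


def physical_frequency(f_at_1msun, total_mass_msun):
    """Rescale a 1-solar-mass frequency to a system of the given total mass."""
    return f_at_1msun / total_mass_msun


t = np.arange(0.0, 200.0, 0.5)
h22 = np.exp(-1j * 0.05 * t)  # toy mode with constant M*omega = 0.05
f_1msun = gw_frequency_at_1msun(t, h22)
f_60msun = physical_frequency(f_1msun, 60.0)
```

Storing the frequency at 1 M☉ keeps the catalog value mass-independent, and a single division recovers the physical frequency for any chosen total mass.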
Metadata Normalization (metadata.md)
Catalog-specific keys are normalized to PyCBC-compatible output in get_source_parameters_from_metadata():
| `catalog_type` value | Input keys | Output keys |
|---|---|---|
| `"RIT"` | `relaxed-chi1x/y/z`, `freq-start-22` | `spin1x/y/z`, `f_lower` |
| `"MAYA"` | `a1x/y/z`, `omega_orbital` | `spin1x/y/z`, `f_lower` |
| `"SXS"` | `reference_dimensionless_spin1/2`, `reference_orbital_frequency` | `spin1x/y/z`, `f_lower` |
RIT keys: the raw metadata text files use hyphenated names (e.g. `relaxed-chi1z`). The internal scraper keeps them unchanged, so they are accessed by their hyphenated names.
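The renaming part of this normalization can be sketched as a per-catalog lookup table. The example below covers only the spin components for RIT and MAYA. It deliberately leaves out `f_lower`, since the frequency entries presumably need a unit conversion that this page does not spell out. `SPIN_KEY_MAPS` and `normalize_spin_keys` are hypothetical names, not part of `get_source_parameters_from_metadata()`.

```python
# Hypothetical rename tables built from the key names listed in the table above.
SPIN_KEY_MAPS = {
    "RIT": {f"relaxed-chi1{ax}": f"spin1{ax}" for ax in "xyz"},
    "MAYA": {f"a1{ax}": f"spin1{ax}" for ax in "xyz"},
}


def normalize_spin_keys(metadata, catalog_type):
    """Return a new dict with catalog-specific spin keys renamed."""
    try:
        key_map = SPIN_KEY_MAPS[catalog_type]
    except KeyError:
        raise ValueError(f"unsupported catalog_type: {catalog_type!r}") from None
    # RIT names contain hyphens, so they can only be read with item access.
    return {new: float(metadata[old]) for old, new in key_map.items()}


# Toy metadata; the values are made up for illustration.
rit_raw = {"relaxed-chi1x": "0.0", "relaxed-chi1y": "0.0", "relaxed-chi1z": "0.8"}
maya_raw = {"a1x": 0.1, "a1y": 0.0, "a1z": -0.3}
rit_params = normalize_spin_keys(rit_raw, "RIT")
maya_params = normalize_spin_keys(maya_raw, "MAYA")
```

Downstream code can then read `spin1z` without knowing which catalog the simulation came from.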