[768, 384, 192, 48, 16] dimensions without re-running the model.
Loading the model
Pass token=... for explicit authentication, or run huggingface-cli login once.
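A minimal loading sketch under the assumption that the weights live on the Hugging Face Hub; the repo id, filename, and import guard are placeholders for illustration, not this project's actual API (hf_hub_download itself is a real huggingface_hub function):

```python
from typing import Optional

# Guarded import so the sketch degrades gracefully where huggingface_hub
# is not installed.
try:
    from huggingface_hub import hf_hub_download
except ImportError:
    hf_hub_download = None

REPO_ID = "your-org/your-model"  # placeholder repo id (assumption)
TOKEN: Optional[str] = None      # pass token="hf_..." explicitly, or rely on
                                 # a prior `huggingface-cli login`

if hf_hub_download is not None:
    # Downloads the checkpoint file and returns its local path.
    ckpt_path = hf_hub_download(REPO_ID, "model.ckpt", token=TOKEN)
```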
From a local Lightning checkpoint:
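A sketch of the local-checkpoint path, assuming the project exposes a LightningModule subclass (the class and module names here are placeholders); load_from_checkpoint is the standard Lightning API:

```python
CKPT = "path/to/checkpoint.ckpt"  # placeholder path

try:
    # Placeholder import: substitute the project's actual LightningModule.
    from my_project import EEGEmbedModel
    model = EEGEmbedModel.load_from_checkpoint(CKPT)  # standard Lightning API
    model.eval()  # inference mode: disables dropout / batch-norm updates
except ImportError:
    model = None  # project package not installed; sketch only
```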
One-call usage
model.embed chains preprocessing and prediction and returns a NumPy array, the most common shape users want for downstream scikit-learn / FAISS work.
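A runnable sketch of the one-call shape, with a stub standing in for model.embed when no model is loaded (the stub and its 768-d unit-norm output are assumptions for illustration):

```python
import numpy as np

def embed_or_stub(model, epochs):
    """Return model.embed(epochs) if a model is available, else a stub
    array with one unit-norm 768-d embedding per epoch (illustration only)."""
    if model is not None:
        return model.embed(epochs)  # the one-call API described above
    out = np.ones((len(epochs), 768), dtype=np.float32)
    return out / np.linalg.norm(out, axis=1, keepdims=True)

emb = embed_or_stub(None, [object()] * 3)  # 3 fake epochs -> (3, 768) array
```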
Step-by-step
model.predict() returns L2-normalized embeddings as a torch tensor on the model's device, so cosine similarity reduces to a dot product. Input is moved to the model's device automatically. Valid dimensions: 768, 384, 192, 48, 16; see Benchmarks for accuracy at each dimension.
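A numpy illustration (standing in for the torch tensors) of why L2 normalization makes cosine similarity a plain dot product:

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = rng.standard_normal((2, 768))
a /= np.linalg.norm(a)
b /= np.linalg.norm(b)

# For unit vectors the cosine formula's denominator is 1, so the dot
# product IS the cosine similarity.
cos_full = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
cos_dot = a @ b
```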
Matryoshka
Compute once at full resolution, truncate later. model.predict() normalizes for you, but after manual truncation you must re-normalize for cosine distance to work correctly.
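A numpy sketch of that workflow on synthetic embeddings; the only project-specific fact used is the list of valid dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)
emb = rng.standard_normal((4, 768))
emb /= np.linalg.norm(emb, axis=1, keepdims=True)  # as model.predict() returns

dim = 192                                  # any of 768, 384, 192, 48, 16
small = emb[:, :dim]                       # truncation breaks unit norm...
small /= np.linalg.norm(small, axis=1, keepdims=True)  # ...so re-normalize
```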
Classification
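A hedged, self-contained classification sketch: a nearest-centroid rule in numpy on synthetic embeddings (a real workflow would feed model.embed output into e.g. a scikit-learn classifier; the data and separability here are fabricated for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=100)
# Synthetic, well-separated "embeddings": class 0 near -1s, class 1 near +1s.
X = (2 * y[:, None] - 1) + 0.3 * rng.standard_normal((100, 48))
X /= np.linalg.norm(X, axis=1, keepdims=True)

# Nearest-centroid: one mean embedding per class, then cosine ranking.
centroids = np.stack([X[y == c].mean(axis=0) for c in (0, 1)])
centroids /= np.linalg.norm(centroids, axis=1, keepdims=True)

pred = np.argmax(X @ centroids.T, axis=1)  # cosine via dot product
acc = float((pred == y).mean())
```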
Similarity / retrieval
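With L2-normalized embeddings, retrieval is a single matrix product followed by a sort; a self-contained numpy sketch on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
db = rng.standard_normal((1000, 192))
db /= np.linalg.norm(db, axis=1, keepdims=True)   # unit-norm "database"

# Query: a lightly perturbed copy of entry 42, re-normalized.
q = db[42] + 0.05 * rng.standard_normal(192)
q /= np.linalg.norm(q)

scores = db @ q                    # cosine similarities in one matmul
topk = np.argsort(-scores)[:5]     # indices of the 5 nearest entries
```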
Batch processing
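A batching sketch: split the inputs into fixed-size batches, embed each, and stack the results (embed_fn is a stub standing in for model.embed; batch size and shapes are illustrative):

```python
import numpy as np

def embed_fn(batch):
    """Stub for model.embed: one unit-norm 768-d vector per input."""
    out = np.ones((len(batch), 768), dtype=np.float32)
    return out / np.linalg.norm(out, axis=1, keepdims=True)

recordings = [np.zeros((8, 7500), dtype=np.float32) for _ in range(10)]
batch_size = 4

embs = np.concatenate([
    embed_fn(recordings[i:i + batch_size])
    for i in range(0, len(recordings), batch_size)
])
```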
Preprocessing details
ne.preprocess matches the exact pipeline used to train the model:
- Bandpass filter (1-100 Hz, 4th-order Butterworth)
- Notch filter (50 Hz and 100 Hz)
- Resample to 250 Hz
- Segment into 30-second epochs (pad if shorter)
- Average channels into 8 canonical brain regions: Frontal, Central, Temporal Left, Temporal Right, Parietal, Occipital, EOG, ECG
- Convert to the temporal matrix representation (224x224 image per channel)
Set stride_seconds=30.0 for non-overlapping epochs (typical when each 30-second window has its own classification label).
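The filtering, resampling, and segmentation steps above can be sketched with scipy. This is an illustrative re-implementation, not the actual ne.preprocess: the 500 Hz input rate is an assumption, and the region-averaging and temporal-matrix steps are omitted.

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch, resample_poly

def preprocess_sketch(x, fs=500):
    """x: (channels, samples) raw EEG sampled at `fs` Hz (assumed rate)."""
    b, a = butter(4, [1.0, 100.0], btype="bandpass", fs=fs)  # 1-100 Hz, 4th order
    x = filtfilt(b, a, x, axis=-1)
    for f0 in (50.0, 100.0):                                 # notch 50 & 100 Hz
        bn, an = iirnotch(f0, Q=30.0, fs=fs)
        x = filtfilt(bn, an, x, axis=-1)
    x = resample_poly(x, 250, fs, axis=-1)                   # resample to 250 Hz
    # Non-overlapping 30-s epochs (stride_seconds=30.0), zero-padded if short.
    win = 250 * 30
    n_epochs = -(-x.shape[-1] // win)                        # ceil division
    x = np.pad(x, ((0, 0), (0, n_epochs * win - x.shape[-1])))
    return x.reshape(x.shape[0], n_epochs, win)

raw = np.random.default_rng(0).standard_normal((8, 500 * 45))  # 45 s, 8 ch
epochs = preprocess_sketch(raw)   # shape (8, 2, 7500)
```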