mirror of
https://github.com/huggingface/lerobot.git
synced 2026-05-23 12:40:08 +00:00
more changes
@@ -0,0 +1,208 @@
# Subtask Token Generation - Quick Reference

## What Was Done

Added **autoregressive subtask token generation** to the PI05 model, with decoding and printing during both training and inference.

## Key Features

✅ **Training:** Prints ground truth subtask tokens for monitoring
✅ **Inference:** Generates and prints predicted subtask tokens using next token prediction
✅ **Autoregressive:** Each token conditioned on previous tokens
✅ **Greedy Decoding:** Selects most likely token at each step

## Implementation Location

**File:** `src/lerobot/policies/pi05/modeling_pi05.py`

**New Method:** `_generate_subtask_tokens()` (lines 844-914)
- Autoregressive token generation
- Uses PaliGemma language model head
- Greedy decoding with early stopping

**Modified Methods:**
- `sample_actions()` - Calls generation and prints during inference
- `predict_action_chunk()` - Passes tokenizer to enable generation
- `forward()` - Prints ground truth tokens during training
- `__init__()` - Loads tokenizer

## Console Output Examples

### Training:
```
[Training] Ground truth subtask 0: pick up the red block
[Training] Ground truth subtask 1: place in blue container
```

### Inference:
```
[Inference] Generated subtask 0: grasp the object
[Inference] Generated subtask 1: move to target location
```

## How to Use

### No Code Changes Required!

The implementation is automatic:

1. **Training:** Just run your training script
   - Subtasks will be printed to console automatically

2. **Inference:** Just run your inference script
   - Subtasks will be generated and printed automatically

### To Disable (if needed):

To disable subtask generation during inference for better performance:

```python
# Temporarily set the tokenizer to None to skip subtask generation;
# restore the original tokenizer afterwards to re-enable it
policy.tokenizer = None
actions = policy.predict_action_chunk(batch)
```
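
Because the switch above is meant to be temporary, it may be convenient to wrap it so the tokenizer is always restored, even if inference raises. The following is a minimal sketch, not part of the commit: `subtask_generation_disabled` is a hypothetical helper name, and it only assumes that the policy exposes a `tokenizer` attribute as described above.

```python
from contextlib import contextmanager


@contextmanager
def subtask_generation_disabled(policy):
    """Temporarily set ``policy.tokenizer`` to None and restore it on exit.

    Hypothetical helper: it relies only on the ``tokenizer`` attribute
    mentioned in this document, not on any other policy internals.
    """
    saved_tokenizer = policy.tokenizer
    policy.tokenizer = None
    try:
        yield policy
    finally:
        # Restore the tokenizer even if the body raised an exception.
        policy.tokenizer = saved_tokenizer


# Intended usage (policy and batch come from your own inference script):
#     with subtask_generation_disabled(policy):
#         actions = policy.predict_action_chunk(batch)
```

After the `with` block exits, `policy.tokenizer` holds its original value again, so later calls generate subtasks as before.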

## Technical Specs

| Property | Value |
|----------|-------|
| **Generation Method** | Autoregressive (sequential) |
| **Decoding Strategy** | Greedy (argmax) |
| **Max Tokens** | 50 (configurable) |
| **Tokenizer** | google/paligemma-3b-pt-224 |
| **Attention** | Causal masking for generated tokens |
| **Performance Cost** | Up to ~50 extra forward passes per inference (one per generated token) |

## Architecture Flow

```
Training:  Ground Truth Tokens → Decode → Print → Loss Computation
Inference: Observations → Generate Tokens → Decode → Print → Action Prediction
```

## Method: `_generate_subtask_tokens()`

**Purpose:** Generate subtask tokens autoregressively

**Algorithm:**
```
1. Start with prefix = [images, high-level task, state]
2. For each position (up to max_length):
   a. Forward pass over the current prefix
   b. Apply LM head → logits for the next token
   c. Select best token (greedy argmax)
   d. Embed token
   e. Append to prefix
   f. Update masks (causal attention)
3. Stop when EOS or max length reached
4. Return generated tokens
```
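
The mask update in step f can be pictured with a small sketch. This is an illustration under stated assumptions, not the code in `modeling_pi05.py`: it assumes prefix positions attend to each other bidirectionally and never to generated tokens, while each generated token attends to the whole prefix and to earlier generated tokens. `build_attention_mask` is a hypothetical name.

```python
def build_attention_mask(prefix_len, num_generated):
    """Return mask[i][j] == True when position i may attend to position j.

    Assumed layout: positions [0, prefix_len) are the prefix, followed by
    num_generated generated tokens.
    """
    total = prefix_len + num_generated
    mask = []
    for i in range(total):
        row = []
        for j in range(total):
            if j < prefix_len:
                # Every position (prefix or generated) may see the prefix.
                row.append(True)
            else:
                # Generated keys are visible only to generated queries at the
                # same or a later position; prefix queries (i < prefix_len)
                # never see them.
                row.append(i >= j)
        mask.append(row)
    return mask
```

Each time a token is appended, the mask grows by one row and one column: the new row is all `True`, and the new column is `False` everywhere except its last entry.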

**Key Parameters:**
- `images` - Visual observations
- `img_masks` - Image padding masks
- `tokens` - Instruction tokens with state
- `masks` - Token attention masks
- `tokenizer` - For EOS detection
- `max_length` - Maximum tokens to generate (default: 50)
- `device` - Computation device
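
Since the tokenizer is used for EOS detection, one related step is cutting a generated sequence at its first EOS before decoding it to text. The sketch below is an assumption about how that could be done, not a description of the actual method; `truncate_at_eos` is a hypothetical helper, and the EOS id would typically come from the tokenizer (Hugging Face tokenizers expose it as `eos_token_id`).

```python
def truncate_at_eos(token_ids, eos_id):
    """Return the tokens before the first EOS (the EOS itself is dropped).

    If no EOS is present, for example because max_length was reached first,
    the sequence is returned unchanged.
    """
    kept = []
    for token in token_ids:
        if token == eos_id:
            break
        kept.append(token)
    return kept


def truncate_batch(batch_token_ids, eos_id):
    """Apply truncate_at_eos to every row of a batch of generated ids."""
    return [truncate_at_eos(row, eos_id) for row in batch_token_ids]
```

This matters mainly when rows in a batch finish at different steps, since shorter rows would otherwise carry tokens generated after their EOS.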

## Files Created

📄 `SUMMARY.md` - Comprehensive summary
📄 `SUBTASK_GENERATION_CHANGES.md` - Detailed technical docs
📄 `SUBTASK_GENERATION_FLOW.md` - Visual flow diagrams
📄 `QUICK_REFERENCE.md` - This file
📄 `examples/dataset/test_subtask_generation.py` - Test script

## Quick Test

```bash
# Test that tokenizer loads correctly
python examples/dataset/test_subtask_generation.py

# Run training to see ground truth subtasks
python your_training_script.py

# Run inference to see generated subtasks
python your_inference_script.py
```

## Troubleshooting

### No subtask output during inference?
- Check that tokenizer loaded: `print(policy.tokenizer)`
- Should see a tokenizer object (not `None`) with `name_or_path='google/paligemma-3b-pt-224'`

### Tokenizer failed to load?
- Check internet connection (first run downloads tokenizer)
- Check that the `transformers` library is installed: `pip install transformers`

### Performance too slow during inference?
- Disable subtask generation by setting `policy.tokenizer = None`
- Or implement KV caching for faster generation (future optimization)
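
To get a rough feel for why KV caching would help, the sketch below counts how many token positions pass through the model during generation. It is a back-of-the-envelope estimate, assuming each step without a cache re-runs the full prefix plus all tokens generated so far; `positions_processed` and the prefix length used in the note are illustrative, not measured values.

```python
def positions_processed(prefix_len, num_steps, kv_cache=False):
    """Count token positions pushed through the model over num_steps steps.

    Without a cache, step t re-processes the prefix plus the t tokens
    generated so far. With a cache, the prefix is processed once and each
    later step processes only the single newest token.
    """
    if num_steps <= 0:
        return 0
    if kv_cache:
        return prefix_len + (num_steps - 1)
    return sum(prefix_len + t for t in range(num_steps))
```

For a hypothetical prefix of 800 positions and the default 50 steps, this gives 41,225 positions without a cache versus 849 with one, which is why early stopping at EOS and caching both matter for inference speed.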

## Integration Points

The implementation integrates seamlessly with existing code:

- **Training Loop:** No changes needed, prints happen automatically
- **Inference Loop:** No changes needed, generation happens automatically
- **Data Processing:** Uses existing tokenizer from processor
- **Loss Computation:** Already implemented in training forward pass

## Future Enhancements

Possible improvements (not yet implemented):

- [ ] KV caching for faster generation
- [ ] Temperature/top-k/top-p sampling
- [ ] Beam search for better quality
- [ ] Optional flag to enable/disable printing
- [ ] Save generated subtasks to file
- [ ] Compute subtask prediction accuracy metrics
- [ ] Use generated subtasks in action prediction (hierarchical)

## Code Snippet - How Autoregressive Generation Works

```python
# Simplified sketch. `model` stands in for the real network: any callable
# that maps the prefix to a list of per-position logits (one score per token).
def generate_subtask_tokens(model, prefix, eos_id, max_length=50):
    generated_tokens = []
    prefix = list(prefix)  # conceptually [images, high_level_task, state]

    for _ in range(max_length):
        # Forward pass over the whole prefix (no KV cache)
        logits = model(prefix)

        # Greedy decode: highest-scoring token at the last position
        last = logits[-1]
        next_token = max(range(len(last)), key=last.__getitem__)

        # Store
        generated_tokens.append(next_token)

        # Stop if EOS
        if next_token == eos_id:
            break

        # Append for next iteration (the real method embeds the token first)
        prefix.append(next_token)

    return generated_tokens
```

## Questions?

See the detailed documentation files:
- `SUBTASK_GENERATION_CHANGES.md` - Full technical details
- `SUBTASK_GENERATION_FLOW.md` - Visual flow diagrams
- `SUMMARY.md` - Complete overview

---

**Implementation Status:** ✅ Complete and Ready to Use