Experimental fork of the Deforum extension for Stable Diffusion WebUI Forge, fixed up to work with Flux.1, integrate Parseq keyframe redistribution logic, and support Wan 2.1 AI video generation. Integrates dynamic camera shake effects with data sourced from EatTheFuture's 'Camera Shakify' Blender plugin.
This fork of the extension is basically working.
⚠️ Configurations that use the experimental render core by enabling the new keyframe distribution feature may require that some unsupported features be kept disabled.
Install, update, and run the 'one-click installation package' of Stable Diffusion WebUI Forge as described in its documentation. Includes:
Other versions may work with this extension, but have not been properly tested.
Get flux1-dev-bnb-nf4-v2.safetensors from Hugging Face and put it into <forge_install_dir>/models/Stable-diffusion/Flux:
https://huggingface.co/lllyasviel/flux1-dev-bnb-nf4/blob/main/flux1-dev-bnb-nf4-v2.safetensors
Get the following 3 files from Hugging Face and put them into <forge_install_dir>/models/VAE:

- ae.safetensors: https://huggingface.co/black-forest-labs/FLUX.1-schnell/tree/main
- clip_l.safetensors: https://huggingface.co/comfyanonymous/flux_text_encoders/tree/main
- t5xxl_fp16.safetensors: https://huggingface.co/comfyanonymous/flux_text_encoders/tree/main

Restart Forge, set mode to "flux", select the Flux checkpoint and all 3 files under "VAE / Text Encoder", and test with Txt2Img.
Go to tab “Extensions” - “Install from URL” and use this: https://github.com/Tok/sd-forge-deforum.git
Open a command line and run <forge_install_dir>/venv/Scripts/activate.bat to activate the Python virtual environment (venv) used by Forge.
With the venv from Forge activated, do:
```
cd <forge_install_dir>/extensions
git clone https://github.com/Tok/sd-forge-deforum
cd sd-forge-deforum
pip install -r requirements.txt
```
Get the latest default_settings.txt, rename it to deforum_settings.txt (or whatever matches the name of your settings file in the UI), place it directly into your 'webui' directory, then click "Load All Settings":
https://raw.githubusercontent.com/Tok/sd-forge-deforum/main/scripts/default_settings.txt
⚠️ Some settings are currently not properly loaded or not persisted in default_settings.txt and may need to be set manually the first time.
Recommendation: Use the Forge UI's "Settings" - "Defaults" to save your settings.
The extension includes Wan 2.1 (Alibaba’s state-of-the-art video generation model) fully integrated with Deforum’s scheduling system for frame-perfect video creation.
```
# Recommended: VACE 1.3B model (8GB+ VRAM) - All-in-one T2V+I2V
huggingface-cli download Wan-AI/Wan2.1-VACE-1.3B --local-dir models/wan

# High Quality: VACE 14B model (16GB+ VRAM) - All-in-one T2V+I2V
huggingface-cli download Wan-AI/Wan2.1-VACE-14B --local-dir models/wan

# Alternative: Separate T2V models (no I2V chaining)
huggingface-cli download Wan-AI/Wan2.1-T2V-1.3B --local-dir models/wan

# Legacy: Separate I2V models (for compatibility with older setups)
huggingface-cli download Wan-AI/Wan2.1-I2V-1.3B --local-dir models/wan
huggingface-cli download Wan-AI/Wan2.1-I2V-14B --local-dir models/wan
```
The Qwen prompt-enhancement models are downloaded automatically into models/qwen/ when first used:
```
# Models are automatically downloaded when "Enhance Prompts" is clicked
# Storage location: webui-forge/webui/models/qwen/
# Auto-selected based on your VRAM: 3B (4GB), 7B (8GB), 14B (16GB+)
```
```json
{
  "0": "mountain landscape",
  "30": "misty valley",
  "60": "golden sunlight",
  "90": "illuminated peaks"
}
```
After prompt enhancement, the same schedule becomes:

```json
{
  "0": "A breathtaking mountain landscape at dawn, with towering snow-capped peaks rising majestically against a pristine azure sky, with camera movement with slow right pan, forward dolly",
  "30": "Morning mist gracefully rising from the valleys below, creating ethereal wisps that dance between ancient pine trees, with camera movement with medium left pan, upward tilt",
  "60": "Golden sunlight breaking through dramatic cloud formations, casting warm amber rays across the rugged terrain and illuminating every crevice, with camera movement with fast zoom in, clockwise roll",
  "90": "Full daylight illuminating the magnificent peaks in all their glory, revealing intricate details of rock formations and alpine meadows, with camera movement with slow backward dolly, downward pitch"
}
```
| Model | VRAM | Type | Description | Best For |
|---|---|---|---|---|
| QwenVL2.5_3B | 8GB | Vision+Text | Fast, supports images | Quick enhancement |
| QwenVL2.5_7B | 16GB | Vision+Text | Balanced quality | Most users ⭐ |
| Qwen2.5_3B | 6GB | Text-only | Memory efficient | Low-VRAM systems |
| Qwen2.5_7B | 14GB | Text-only | High quality | Text enhancement |
| Qwen2.5_14B | 28GB | Text-only | Maximum quality | High-end systems |
Auto-Selection Logic: The system automatically chooses the best model for your VRAM:
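That auto-selection could be sketched as a simple threshold check. This is an illustrative snippet using the VRAM figures from the table above; `select_qwen_model` is a hypothetical name, not the extension's actual API:

```python
def select_qwen_model(vram_gb: float) -> str:
    """Pick the largest Qwen model that fits the available VRAM.

    Thresholds mirror the VRAM column of the table above; the real
    extension's selection logic may differ.
    """
    if vram_gb >= 28:
        return "Qwen2.5_14B"     # maximum quality, text-only
    if vram_gb >= 16:
        return "QwenVL2.5_7B"    # balanced vision+text default
    if vram_gb >= 8:
        return "QwenVL2.5_3B"    # fast, still supports images
    return "Qwen2.5_3B"          # memory-efficient text-only fallback

print(select_qwen_model(16))  # QwenVL2.5_7B
```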
The system translates complex Deforum schedules into human-readable descriptions with frame-specific analysis:
| Deforum Schedule | AI Translation |
|---|---|
| `translation_x: "0:(0), 30:(100)"` | "camera movement with moderate panning right (extended)" |
| `translation_z: "0:(0), 60:(-50)"` | "camera movement with gentle dolly backward (sustained)" |
| `rotation_3d_y: "0:(0), 45:(20)"` | "camera movement with subtle rotating right (extended)" |
| `zoom: "0:(1.0), 30:(1.5)"` | "camera movement with moderate zooming in (brief)" |
Frame-Specific Analysis: Each prompt gets unique movement descriptions based on its position in the video timeline:
Camera Shakify Integration: When enabled, the system analyzes the actual Camera Shakify pattern at each frame position to provide varied, specific directional descriptions instead of generic “investigative handheld camera movement” text.
Combined Example:

- Input: `translation_x: "0:(0), 30:(100)"`, `rotation_3d_x: "0:(0), 60:(15)"`, `zoom: "0:(1.0), 40:(0.7)"`
- Camera Shakify: INVESTIGATION pattern enabled
- Output: `"camera movement with moderate panning right (extended), subtle tilting up (sustained), and gentle zooming out (brief)"`
VACE (Video Adaptive Conditional Enhancement) models are Wan’s latest all-in-one architecture that handles both Text-to-Video and Image-to-Video generation with a single model, providing superior consistency for I2V chaining:
```json
{
  "0": "A serene mountain landscape at dawn",
  "30": "Morning mist rising from the valleys",
  "60": "Golden sunlight breaking through clouds",
  "90": "Full daylight illuminating the peaks"
}
```
At 30 FPS, this creates exactly 1-second clips with seamless I2V transitions using VACE’s unified architecture.
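The timing claim is simple arithmetic; a quick check with the keyframes from the example above:

```python
FPS = 30
keyframes = [0, 30, 60, 90]  # prompt keyframes from the schedule above

# Consecutive keyframes are 30 frames apart, so each clip lasts
# exactly 30 / 30 = 1.0 second before the next I2V-chained clip starts.
clip_seconds = [(b - a) / FPS for a, b in zip(keyframes, keyframes[1:])]
print(clip_seconds)  # [1.0, 1.0, 1.0]
```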
| Model | Type | Size | VRAM | Speed | Quality | I2V Chaining | Best For |
|---|---|---|---|---|---|---|---|
| VACE-1.3B | All-in-one | ~17GB | 8GB+ | Fast | Good | ✅ Perfect | Most Users ⭐ |
| VACE-14B | All-in-one | ~75GB | 16GB+ | Slow | Excellent | ✅ Perfect | High-end Systems |
| T2V-1.3B | T2V Only | ~17GB | 8GB+ | Fast | Good | ❌ None | Independent Clips |
| T2V-14B | T2V Only | ~75GB | 16GB+ | Slow | Excellent | ❌ None | Independent Clips |
| I2V-1.3B | I2V Only | ~17GB | 8GB+ | Fast | Good | ✅ Good | Legacy I2V Chaining |
| I2V-14B | I2V Only | ~75GB | 16GB+ | Slow | Excellent | ✅ Good | Legacy I2V Chaining |
💡 Recommendation: Use VACE models for I2V chaining workflows, T2V models only for independent clip generation.
For comprehensive documentation, see:
After installation, you can test the setup by generating the default bunny with "Distribution" set to "Keyframes Only" and "Animation Mode" set to "3D". On first run, this also downloads the Depth-Anything V2 or MiDaS model for depth warping, and it demonstrates prompt synchronization in a no-cadence setup.
The default bunnies contain 333 frames at 720p, but only 19 of them are actually diffused. The diffused frames are placed in the clip according to the keyframes defined in the prompts. The prompts themselves are aligned to be synchronized at 60 FPS with the beat of an amen break you can find linked in the settings (enable sound):
https://github.com/user-attachments/assets/5f637a04-104f-4d87-8439-15a386685a5e
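For reference, the bunny numbers above work out as follows (plain arithmetic, not extension code):

```python
total_frames, diffused = 333, 19  # figures from the default-bunny example
fps = 60

# Only the keyframe-aligned frames are diffused; the rest are
# interpolated between them by the keyframe-distribution core.
print(f"diffused share: {diffused / total_frames:.1%}")   # about 5.7%
print(f"interpolated frames: {total_frames - diffused}")  # 314
print(f"clip length: {total_frames / fps:.2f} s")         # 5.55 s
```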
If you used other versions of the Deforum plugin before, it may also be necessary to update or adjust your Deforum settings. The latest example settings for the default bunny can also be downloaded here:
https://github.com/Tok/sd-forge-deforum/blob/main/scripts/default_settings.txt
Causes the rendering to run on an experimental core that can rearrange keyframes, which makes it possible to set up fast generations with less jitter at high or no cadence.
All subtitles are now generated and written to an .srt file in advance. Complex subtitle generations should work fine with Parseq but are currently limited with Deforum-only setups.
Add camera shake effects to your renders on top of your other movement.
This feature enhances the realism of your animations by simulating natural camera movements, adding a layer of depth and engagement to your visuals. Perfect for creating action sequences or adding a sense of spontaneity, it allows for customizable shake parameters to fit your specific needs.
The shake data is available under Creative Commons CC0 1.0 Universal license and was sourced from the ‘Camera Shakify’ Blender plugin by EatTheFuture.
During active development, the content and structure of the deforum_settings.txt file can change quickly. Settings from older versions may not behave as expected.
If necessary, the latest default settings are available for download here:
https://github.com/Tok/sd-forge-deforum/blob/main/scripts/default_settings.txt
```
pip install -r requirements.txt
```