Alan continuously captures screenshots of your desktop, runs image recognition on each one, and logs the results for later analysis.
👋 Hey! Watch me code it in this 📺recorded livestream.
- Automated screenshot capture
- Image recognition using Google's Gemini 2.0 Flash model
- Continuous results logging with timestamps
- Built with TypeScript and Bun runtime
- Bun runtime installed
- Gemini API key
- TypeScript 5.x
- Clone the repository:
  git clone https://github.com/gavmor/alan.git
  cd alan
- Install dependencies:
  bun install
- Set up your Gemini API key:
  - Visit Google AI Studio
  - Create an API key
  - Set the API key in your environment:
    export GEMINI_API_KEY='your-api-key-here'
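Because the application depends on `GEMINI_API_KEY` being exported, it can help to fail fast with a clear message when the variable is missing. The following is a minimal sketch of such a check, not code taken from the repository; the helper name `requireGeminiApiKey` and the `Env` type are hypothetical, and the environment is passed in as a parameter so the check is easy to exercise.

```typescript
// Hypothetical helper: not part of the alan repository.
// The environment is a parameter so the check can run without touching real variables.
type Env = Record<string, string | undefined>;

function requireGeminiApiKey(env: Env): string {
  // Treat an unset, empty, or whitespace-only value as missing.
  const key = env["GEMINI_API_KEY"]?.trim();
  if (!key) {
    throw new Error(
      "GEMINI_API_KEY is not set; export it before running the application.",
    );
  }
  return key;
}
```

Under Bun, this would typically be called once at startup as `requireGeminiApiKey(process.env)`.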
Run the application:
bun run index.ts
The application will:
- Take screenshots of your desktop
- Send each screenshot to the Gemini model for image recognition
- Append the recognition results with timestamps to results.txt
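The three steps above can be sketched as a small loop. This is an illustration under stated assumptions, not the contents of `index.ts`: the names `Capture`, `Recognize`, `AppendLine`, `formatEntry`, and `runLoop` are hypothetical, the capture and recognition steps are passed in as functions rather than calling any real library, and the tab-separated ISO timestamp format is only one plausible choice for a log line.

```typescript
// Hypothetical names throughout: this sketch is not the code in index.ts.
type Capture = () => Promise<Uint8Array>; // returns one screenshot's bytes
type Recognize = (image: Uint8Array) => Promise<string>; // returns a description
type AppendLine = (line: string) => Promise<void>; // appends one line to the log

// Format one log entry as "<ISO timestamp>\t<description on a single line>".
function formatEntry(when: Date, description: string): string {
  const oneLine = description.replace(/\s+/g, " ").trim();
  return `${when.toISOString()}\t${oneLine}`;
}

// Run the capture -> recognize -> log cycle `iterations` times,
// pausing `delayMs` milliseconds between cycles.
async function runLoop(
  capture: Capture,
  recognize: Recognize,
  appendLine: AppendLine,
  options: { iterations: number; delayMs: number; now?: () => Date },
): Promise<void> {
  const now = options.now ?? (() => new Date());
  for (let i = 0; i < options.iterations; i++) {
    try {
      const image = await capture();
      const description = await recognize(image);
      await appendLine(formatEntry(now(), description));
    } catch (error) {
      // Record the failure and continue, so one bad cycle does not stop the loop.
      await appendLine(formatEntry(now(), `ERROR: ${String(error)}`));
    }
    if (i < options.iterations - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, options.delayMs));
    }
  }
}
```

In a real run, `appendLine` could wrap `appendFile` from `node:fs/promises` (which Bun supports) pointed at `results.txt`, while `capture` and `recognize` would wrap the screenshot library and the Gemini call. Injecting them keeps the loop itself easy to test with stubs.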
To run tests:
bun test
Dependencies:
- screenshot-desktop: For capturing desktop screenshots
- (Linux) image-magick
- (Linux) ollama: For interfacing with Ollama AI models
- TypeScript: For type-safe development