Passionate AI Engineer | Deep Learning Researcher | Backend Developer

Technical Skills: Python, Machine Learning, Deep Learning, TensorFlow, Hugging Face, Flask, Django, SQL, DVC, MLflow, Pandas, NumPy, Computer Vision, NLP

Education

B.Sc. in Computer Science & Engineering
American International University-Bangladesh (Jan 2020 - Dec 2023)
- Recipient of the Dean’s Award for remarkable academic achievement
H.S.C. in Science
BAF Shaheen College, Dhaka (Jun 2018)

Experience

AI Engineer - TechnyX.ai (Full-time, On-site) (December 2024 - Present)

Led development of advanced computer vision and multimodal AI solutions, addressing real-world industry challenges with 98% accuracy.
Architected and deployed vision-language models for in-house API development, reducing inference cost by 40% through efficient model deployment.
Engineered state-of-the-art computer vision algorithms and end-to-end AI applications, integrating LLMs with computer vision systems to build robust solutions.
Collaborated with cross-functional teams to design and deploy scalable AI software solutions, ensuring seamless integration of multiple AI technologies across the platform.

Freelancer - Upwork (Part-time, Remote) (October 2024 - Present)

Developed a state-of-the-art super-resolution model for image upscaling, leveraging advanced techniques to enhance image quality and detail.
Modified the architecture of the "Dual Attention Transformer" to effectively upscale images from two distinct sources, ensuring optimal performance and quality.
Designed a robust architecture that utilizes channel-wise concatenation of images, focusing on the most efficient features to improve the overall upscaling process.
Conducted extensive testing and validation of the model to ensure high accuracy and reliability in various real-world applications.
Collaborated with clients to understand their specific needs and provided tailored solutions, resulting in a 30% increase in client satisfaction ratings.

Junior AI Developer - CRTVAI (Full-time, Remote) (April 2024 - September 2024)

Designed and implemented the system architecture and API endpoints of an automated quality control AI application to enhance customer care services.
Fine-tuned speech-to-text models and deployed them serverlessly to minimize third-party transcription costs.
Improved the Voice Activity Detection pipeline and conducted a detailed analysis of speech metrics to enhance accuracy and efficiency.
Achieved a 25% reduction in GPT API call costs through strategic optimization.
Implemented background task processing to improve system performance and reliability.

Research Assistant in Computer Vision (Full-time) (Sep 2023 - Jan 2024)

Collaborated with the research team on a computer vision research project.
Conducted comprehensive research, experiments, and data analysis to support project objectives.
Research Collaboration: Actively participated in a Computer Vision research project, leveraging my skills and knowledge to contribute to the team’s success.
Research Dataset: Contributed to the creation of a research dataset containing 12,000+ labeled images, supporting future computer vision research and applications.
Research and Data Analysis: Reduced research time by 20% by implementing efficient data analysis and documentation practices.
Documentation & Presentation: Successfully presented research progress and insights to the research group and advisor, ensuring clear communication and project transparency.

Research Experience

[Draft] Research: A Large-Scale Action Dataset

Publications

[Under-Review (Scientific Reports)] Research: High-Accuracy Image Segmentation for Self-Driving Cars
Speech Emotion Recognition using Transfer Learning Approach and Real-Time Evaluation in English and Bengali Language | ResearchGate

Projects

DeepCrawl-Chat: Intelligent Web Crawler and RAG System | GitHub Link

Full-Stack AI Application | [March 2025 - Present]

Developed “DeepCrawl-Chat,” an intelligent system for advanced web crawling, information extraction, and Retrieval-Augmented Generation (RAG). It lets users crawl websites and interactively query the content using AI language models, supporting tasks such as competitor website analysis and LLM-integrated data analysis.
- Advanced, configurable web crawling (depth, concurrency, filters).
- Efficient extraction of text, links, and media.
- High-performance asynchronous crawling and parallel document processing.
- Interactive chat interface using LangChain memory, allowing users to have natural language conversations about crawled content.
- RAG system for asking questions about crawled content.
- Designed for standalone use or flexible integration via a FastAPI-based API.
Tech-Stack: Python, FastAPI, LangChain, FAISS, Docker, Uvicorn, NVIDIA AI, SQLAlchemy, Redis, Groq API.

YouTube Video Summarizer | GitHub Link

Full-Stack AI Application | [April 2025 - Present]

Developed a Python application that automates YouTube video transcription and summarization using LLMs, reducing content review time by up to 90%.
Architected a scalable solution with FastAPI backend, Streamlit frontend, and SQLAlchemy ORM with support for both SQLite and PostgreSQL databases.
Implemented semantic search capabilities using FAISS vector database, enabling users to query across multiple video transcripts simultaneously.
Integrated Groq’s Whisper API for accurate speech-to-text transcription and LLMs for generating concise, high-quality summaries of video content.
Built an interactive chat interface using LangChain memory, allowing users to have natural language conversations about video content.
Created a robust caching system with Redis (with in-memory fallback) to optimize performance and minimize API usage costs.
Designed multiple access interfaces including a web UI, RESTful API, and command-line interface for maximum flexibility.
Tech-Stack: FastAPI, Streamlit, LangChain, FAISS, SQLAlchemy, Redis, Groq API, NVIDIA Inference API, PyTube

(Chat Bot) MediChat-Assistant | GitHub Link

Developed a medical chatbot application leveraging the open-source LLM Llama-2 7B-chat with Retrieval-Augmented Generation (RAG) for accurate, source-based responses.
- Accurate medical assistance: Utilized RAG to provide precise, context-aware responses by retrieving relevant information from trusted medical sources.
- User-friendly interaction: Built a seamless interface using FastAPI, enabling easy access to the chatbot via RESTful APIs.
- Enhanced response quality: Integrated LangChain for efficient prompt engineering and improved conversational flow.
- Scalable and modular: Designed with a modular architecture, allowing for easy integration of additional features or datasets.
Tech-Stack: LLM (Llama-2 7B-chat), RAG, VectorDB, Python, FastAPI, LangChain
Skills: Natural Language Processing, API Development, Prompt Engineering, Vector Database Integration

(Python Package) SpectraClassify | GitHub Link

Built a user-friendly web application for training custom image classification models without writing code.
- Effortlessly create custom models: Users train models on their own data through a simple interface, with no coding expertise required.
- Experiment with diverse models: Select from a variety of pre-trained models or fine-tune existing ones for specific tasks.
- Seamless testing and inference: Users upload images or use webcam input to test the accuracy and effectiveness of their trained models.
- Published as a PyPI package: Gives developers broader access and quick image classification model training with minimal code.
Tech-Stack: Python, TensorFlow, JS, HTML, Bash
Skills: Deep Learning, Web Development, CI/CD pipeline.

Kidney Tumors and Stones Classification | GitHub Link

A deep learning project aimed at accurately classifying kidney tumors and stones from medical images.
- The project is designed to integrate state-of-the-art machine learning techniques with robust data management and collaborative tools, providing a significant contribution to medical imaging analysis.
Tech-Stack: Python, TensorFlow, DVC, MLOps
Skills: Deep Learning, Web Development, Version Control.

Community Platform for AIUB Students - AIG | GitHub Link

Developed a comprehensive web application (ASP.NET Web API) using C# and Entity Framework to foster the AIUB student community. Adhering to SOLID principles and a robust 3-tier architecture, the platform empowers students with:
- Engaging communication: Features for students to connect and share information.
- Streamlined resume building: Efficient tools for users to create resumes.
- Job opportunities: Centralized platform for posting and applying to job openings with one-click ease.
- Advanced access control: Admin control over post and job moderation, user management, and security.
Responsibilities included: Full database design, Use-case analysis, Authentication & authorization, Development of features and API endpoints
Tech-Stack: ASP.NET, C#, Entity Framework, Git
Skills: Software Development, Version Control

Emotion Classification from Face Images in Real-Time | GitHub Link

Developed a real-time emotion classification system using a custom deep learning model and Flask Web API. This project enables users to:
- Classify emotions in real-time: Leverage their webcam to capture facial expressions and receive instant emotion classification through a user-friendly web interface.
- High-accuracy classification: The custom deep learning model delivers accurate emotion identification, fostering potential applications in various fields.
Tech-Stack: Python, TensorFlow, OpenCV, Flask, JavaScript, HTML, CSS
Skills: Deep Learning, Image Processing, Web Development.

On-the-Go-Podcast | GitHub Link

Developed a podcast app using PHP to empower podcast enthusiasts. This platform offers:
- Streamlined organization: Efficient tools for managing playlists and listening progress.
- Seamless podcast experience: Effortlessly discover, subscribe to, and enjoy favorite podcasts, all within a user-friendly interface.
Tech-Stack: PHP, JavaScript, Bootstrap, CSS, HTML, Git
Skills: Web development

Fashion Recommendation System Using ResNet-50 | GitHub Link

An end-to-end fashion recommendation system using deep learning and machine learning.
- Feature extraction using ResNet-50
- Similarity matching using k-nearest neighbors (KNN)
Tech-Stack: Python, NumPy, TensorFlow, Scikit-learn, Git
Skills: Machine learning

Chrome Dino Clone Built with OpenGL | GitHub Link

A 2D Chrome Dino-like game built using OpenGL (freeglut) and C++.
Tech-Stack: C++, OpenGL
Skills: Computer Graphics, Game logic

Hospital Management System | GitHub Link

Developed a comprehensive hospital management system to streamline operations and enhance efficiency in multi-department healthcare facilities. This desktop application features:
- Multi-user access control: Granular role-based permissions for doctors, nurses, administrators, and other staff members.
- Comprehensive department management: Dedicated modules for various departments like admissions, billing, pharmacy, and patient records.
- User-friendly interface: Intuitive design for seamless navigation and efficient data management.
Tech-Stack: C#, C#-Form, MS-SQL, Git
Skills: Desktop application, Database design.

Honors and Awards

Dean’s List Award - for outstanding academic performance in undergrad.

Contact

Reach me by email: sadhin.aiub.cse@gmail.com
Follow me on LinkedIn