About me: I am a deep learning researcher, primarily interested in computer vision. My current research interests include generative AI and applications of deep learning in bio-medicine.
Over the years, I have leveraged AI to solve a broad set of problems spanning multiple domains like vision, natural language and speech. On the engineering side of things, I have worked at multiple startups to build their AI products and data pipelines.
I occasionally participate in Kaggle competitions and am currently ranked a Competitions Expert. I also (try to) maintain a blog where I share my insights on various AI-related topics. You can check it out here.
Papers I have published
Paul S. Scotti*, Atmadeep Banerjee*, Jimmie Goode, Stepan Shabalin, Alex Nguyen, Ethan Cohen, Aidan J. Dempster, Nathalie Verlinde, Elad Yundler, David Weisberg, Kenneth A. Norman, Tanishq Mathew Abraham
NeurIPS 2023, Spotlight
[arxiv] [paper] [website]
Zudi Lin, Prateek Garg, Atmadeep Banerjee, Salma Abdel Magid, Deqing Sun, Yulun Zhang, L. Gool, D. Wei, H. Pfister
[arxiv]
Siddhant Kharbanda, Atmadeep Banerjee, Akash Palrecha, Devaansh Gupta, Rohit Babbar
SIGIR 2023
[arxiv]
Atmadeep Banerjee
CODS COMAD 2021: 8th ACM IKDD CODS and 26th COMAD
[paper]
Things I have formally worked on and positions I have held
INDUSTRY
Working on projects related to neuro AI.
My work on fMRI-to-image reconstruction was published at NeurIPS 2023. I also worked on the Algonauts 2023 challenge, which involves predicting fMRI responses to a given image. I am currently working on EEG-to-image reconstruction.
Morphle is a biomedical robotics company that builds automated microscopes.
My work was centered around computer vision for pathology. I built two sets of AI models that integrate with the microscope. The first set enables the robot to optimally capture critical areas of a slide. This is achieved by rapidly taking high FoV images at low resolution and detecting important areas using deep learning models. This is followed by optimal path calculation using algorithms like DFS and rescanning these important areas at high resolution. The second set of models works on top of high resolution images to detect cells and other inclusions. Some examples include detection of malarial parasites, blast cells (leukemia) and brain tumours.
VisualOne was a Computer Vision startup building a few-shot learning framework, specifically for event detection using security cameras.
I worked on increasing the robustness of object detection. The final approach consisted of 3 modifications to the training step:
Pixxel is a Remote Sensing startup working towards building a health monitor for the planet. It aims to launch a constellation of nanosatellites for real-time satellite imagery and analytics. After Pixxel became a startup, I was formally employed as an intern, working alongside my coursework.
Example of synthetic multispectral imagery. The model takes source information consisting of a radar (SAR) image, a multispectral image and a timestamp, and generates a multispectral image of the same area for a query timestamp. Using multiple query timestamps for the same source allows one to visualize the effect of the change of seasons on a piece of land.
ACADEMIA
My research was in the domain of extreme classification — classification problems with millions of labels.
My research led to 3 publications:
I worked under the guidance of Prof. Hanspeter Pfister on instance segmentation of natural images using metric learning, connectomic segmentation from 3D electron microscope imagery, and single image super-resolution.
Example of a 3D neuron mesh segmented from EM using my trained model (flood-filling network). Left is GT and right is Pred. The model is unable to segment the smaller dendrons automatically. Fortunately this is a human-in-the-loop model which allows errors to be fixed with human intervention.
AT UNIVERSITY
Before it was a startup, Pixxel started off as a student team at BITS Pilani. I was a member of the Pixxel AI Team during its inception, and a few months later became the AI lead. During my time at Pixxel I worked on computer vision models for extracting information from satellite imagery. The primary focus of my work was to build a high-accuracy model for extracting Indian road networks as graphs from satellite imagery. I also worked on models for crop yield prediction and building segmentation.
I worked on a project for detecting and classifying various kinds of road signs appearing on Indian roads (advertisements, traffic signs, etc.). I trained various single-shot and region-based object detection algorithms on a dataset provided by MapMyIndia with around 16,000 annotated images across 30 classes and compared their speed and accuracy. For the final submission I trained a network based on YOLO-v3 that achieved a mAP score of 89.71 and an F1-score of 0.94.
I was a member of the Game Development and ML Teams of BITS Pilani Coding Club from August 2017 till my graduation in 2021. I was the club's Joint Coordinator for Apogee, BITS Pilani's Tech Fest, from 2019 to 2020.
As the Joint Coordinator for Apogee 2020, I co-led the creation of a gamified (Pokemon Go-like) AR app for crowd control during the fest.
Prior to this, as a part of the club's game development team, I was responsible for designing and building original games to be played by students attending the various fests at BITS. I have worked on Kinect and VR based games.
Apart from this, I helped organize various workshops and hackathons. I helped organize a workshop for teaching 3D modelling and game programming. As a member of the ML team, I organised a Machine Learning hackathon, sponsored by Yes Bank, during Apogee 2019.
Things I have independently worked on
My submission for the UW-Madison GI Tract Image Segmentation Kaggle competition. The final submission is an ensemble of 3 different types of models, with test-time augmentation for each.
My submission won a silver medal.
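For readers unfamiliar with the idea, ensembling with test-time augmentation can be sketched as follows. This is only an illustration and not the competition code. The horizontal flip, the plain averaging and the function names are assumptions; the actual augmentations and model weighting are not stated here.

```python
import numpy as np

def predict_with_tta(model, image):
    """Average a model's prediction over the image and its horizontal flip.

    `model` is any callable mapping an (H, W) array to an (H, W) array
    (a hypothetical stand-in for a segmentation network).
    """
    pred = model(image)
    # Predict on the flipped image, then flip the prediction back.
    flipped_pred = model(image[:, ::-1])[:, ::-1]
    return (pred + flipped_pred) / 2.0

def ensemble_predict(models, image):
    """Average the test-time-augmented predictions of several models."""
    return np.mean([predict_with_tta(m, image) for m in models], axis=0)

# Toy usage with stand-in "models" that just scale their input.
image = np.arange(6, dtype=float).reshape(2, 3)
toy_models = [lambda x: x, lambda x: 3 * x]
mask_scores = ensemble_predict(toy_models, image)
```

In practice each stand-in callable would be a trained segmentation model, and the averaged scores would be thresholded to obtain the final mask.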
This project explores Q-Learning to train an agent for the game DotsNBoxes. Several agents were trained using settings like random opponent, heuristic opponent and adversarial self-play. Our observations on agent performance are compiled in a report in the link given.
The code is written from scratch in Java.
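For context, the core of tabular Q-learning is a single update rule. It is sketched here in Python for brevity, although the project itself is in Java. The state and action encoding, the learning rate `alpha` and the discount `gamma` are illustrative assumptions, not values from the project, and the handling of opponent turns in self-play is not shown.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    q maps (state, action) pairs to estimated returns.
    next_actions lists the legal actions in next_state (empty if terminal).
    """
    # Value of the best follow-up action; 0.0 when next_state is terminal.
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]

# Toy usage with hypothetical state and action labels.
q = defaultdict(float)
first = q_update(q, "s0", "a0", reward=1.0, next_state="s1", next_actions=[])
q[("s1", "a1")] = 0.5
second = q_update(q, "s0", "a0", reward=0.0, next_state="s1", next_actions=["a1"])
```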
This model was made for the Kaggle APTOS 2019 Blindness Detection competition. I trained a CNN model to detect the occurrence of diabetic retinopathy from fundus photography. The model outputs an integer between 0 and 4, with 0 indicating no DR and 4 indicating proliferative DR.
My submission won a silver medal in the competition.
I trained a GAN (Wasserstein GAN with Gradient Penalty) to generate new images of (fire) Pokemon. For training, I scraped images from DuckDuckGo Image Search to make a custom dataset. 2,109 images were downloaded and augmented to make a dataset of 37,962 images. Head over to the GitHub link for more information.
MetaAI
This is a deep learning library specialised for meta-learning. It is built on top of fastai v1 and PyTorch.
Most Deep Learning libraries depend on code written in a language more performant than Python, and this code is not easily accessible by users. A new learner is not able to see how DL algorithms are implemented, essentially turning these algorithms into black boxes. So I built a fully functional CNN library from scratch using only NumPy. The code is designed to be simple, easy to read, and easily extendable by the user.
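To give a flavour of what "from scratch" means here, below is a minimal sketch of a 2D convolution forward pass in plain NumPy. It is an illustration only. The function name `conv2d_forward`, the single-channel input and the stride of 1 are assumptions, not the library's actual API.

```python
import numpy as np

def conv2d_forward(x, w, b=0.0):
    """Valid cross-correlation of a 2D input x with a 2D kernel w (stride 1)."""
    kh, kw = w.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.empty((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the kernel with the window under it and sum the result.
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w) + b
    return out

# Toy usage: a 4x4 input and a 2x2 kernel of ones give a 3x3 output.
x = np.arange(16, dtype=float).reshape(4, 4)
w = np.ones((2, 2))
y = conv2d_forward(x, w)
```

The explicit loops are slow but keep every step visible, which is the point of a teaching-oriented implementation.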
This project enables users to stream tweets and news articles in real time depending on a search query. The streamed text corpora are tokenized, stemmed and then vectorized using word2vec embeddings. The vectorized sentences are then read by a CNN model which outputs the mean sentiment associated with the text corpus on a scale of 0 to 1, with 1 being highly positive and 0 being highly negative.
The aim of this project is to enable users to assess the public sentiment related to a topic or organization at any point of time.
This project allows users to simulate a firewall using the logic programming language Prolog. Users can create a rule set dictating what kinds of packets to allow through the firewall. Different rules can be set for adapter and ethernet clauses. The system determines if an incoming packet is to be allowed or blocked. The GitHub link has a more detailed explanation.
The project was completed in partial fulfilment of the course Logic in Computer Science in BITS Pilani.
Don't want to go through the whole website? See just the important stuff. Downloadable as a PDF.
Get in touch with me