Arnav Chavan

Mumbai, Maharashtra, India
11K followers · 500+ connections

About

https://sites.google.com/view/arnavchavan/

Experience

  • Nyun AI

    Abu Dhabi, Abu Dhabi Emirate, United Arab Emirates

    Bengaluru, Karnataka, India

    Abu Dhabi, United Arab Emirates

    Germany

Publications

  • Dynamic Kernel Selection for Improved Generalization and Memory Efficiency in Meta-learning

    CVPR 2022

    Gradient-based meta-learning methods are prone to overfit on the meta-training set, and this behaviour is more prominent with large and complex networks. Moreover, large networks restrict the application of meta-learning models on low-power edge devices. While choosing smaller networks avoids these issues to a certain extent, it affects the overall generalization, leading to reduced performance. Clearly, there is an approximately optimal choice of network architecture that is best suited for every meta-learning problem; however, identifying it beforehand is not straightforward. In this paper, we present MetaDOCK, a task-specific dynamic kernel selection strategy for designing compressed CNN models that generalize well on unseen tasks in meta-learning. Our method is based on the hypothesis that for a given set of similar tasks, not all kernels of the network are needed by each individual task. Rather, each task uses only a fraction of the kernels, and the selection of the kernels per task can be learnt dynamically as a part of the inner update steps. MetaDOCK compresses the meta-model as well as the task-specific inner models, thus providing a significant reduction in model size for each task, and through constraining the number of active kernels for every task, it implicitly mitigates the issue of meta-overfitting. We show that for the same inference budget, pruned versions of large CNN models obtained using our approach consistently outperform the conventional choices of CNN models. MetaDOCK couples well with popular meta-learning approaches such as iMAML. The efficacy of our method is validated on CIFAR-fs and mini-ImageNet datasets.

    See publication
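
    The dynamic kernel selection idea can be pictured with a short PyTorch sketch. This is an illustration under my own naming, not the MetaDOCK code: per-kernel gates on a shared convolution are the only parameters touched by the inner-loop update, so each task activates its own subset of kernels while the convolution weights remain meta-learned in the outer loop.

    import torch
    import torch.nn as nn

    class GatedConv(nn.Module):
        """Convolution whose output kernels are masked by task-specific gates."""
        def __init__(self, in_ch, out_ch):
            super().__init__()
            self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
            self.gate_scores = nn.Parameter(torch.zeros(out_ch))  # one score per kernel

        def forward(self, x):
            mask = torch.sigmoid(self.gate_scores).view(1, -1, 1, 1)  # soft kernel selection
            return self.conv(x) * mask

    # One inner-loop step on a single task: adapt only the gate scores so that
    # this task selects its own kernels; the shared conv weights would instead
    # be updated in the outer (meta) loop.
    layer, head = GatedConv(3, 16), nn.Linear(16 * 32 * 32, 5)
    x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 5, (8,))
    loss = nn.functional.cross_entropy(head(layer(x).flatten(1)), y)
    grad, = torch.autograd.grad(loss, [layer.gate_scores])
    layer.gate_scores.data -= 0.1 * grad  # inner update on the gates only
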
  • Vision transformer slimming: Multi-dimension searching in continuous optimization space

    CVPR 2022

    This paper explores the feasibility of finding an optimal sub-model from a vision transformer and introduces a pure vision transformer slimming (ViT-Slim) framework. It can search a sub-structure from the original model end-to-end across multiple dimensions, including the input tokens, MHSA and MLP modules with state-of-the-art performance. Our method is based on a learnable and unified ℓ1 sparsity constraint with pre-defined factors to reflect the global importance in the continuous searching space of different dimensions. The searching process is highly efficient through a single-shot training scheme. For instance, on DeiT-S, ViT-Slim only takes 43 GPU hours for the searching process, and the searched structure is flexible with diverse dimensionalities in different modules. Then, a budget threshold is employed according to the requirements of accuracy-FLOPs trade-off on running devices, and a re-training process is performed to obtain the final model. The extensive experiments show that our ViT-Slim can compress up to 40% of parameters and 40% FLOPs on various vision transformers while increasing the accuracy by 0.6% on ImageNet. We also demonstrate the advantage of our searched models on several downstream datasets. Our code is available at https://github.com/Arnav0400/ViT-Slim.

    See publication
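
    A rough sketch of the search objective, as I read the abstract (illustrative, not the released ViT-Slim code): learnable soft masks sit on a searchable dimension, an ℓ1 penalty drives their global sparsity during a single training run, and a budget-derived threshold afterwards decides which units survive re-training.

    import torch
    import torch.nn as nn

    class SlimmableMLP(nn.Module):
        """Transformer-style MLP block whose hidden units carry learnable soft masks."""
        def __init__(self, dim, hidden):
            super().__init__()
            self.fc1, self.fc2 = nn.Linear(dim, hidden), nn.Linear(hidden, dim)
            self.mask = nn.Parameter(torch.ones(hidden))  # one searchable dimension

        def forward(self, x):
            return self.fc2(torch.relu(self.fc1(x)) * self.mask)

    block = SlimmableMLP(dim=192, hidden=768)
    tokens = torch.randn(4, 197, 192)                  # batch of token sequences
    task_loss = block(tokens).pow(2).mean()            # stand-in for the real training loss
    sparsity_loss = 1e-4 * block.mask.abs().sum()      # unified l1 constraint on the masks
    (task_loss + sparsity_loss).backward()

    # After the single-shot search, units whose mask magnitude falls below a
    # budget-derived threshold are dropped and the slimmed model is re-trained.
    keep = block.mask.detach().abs() > 0.05
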
  • ChipNet: Budget-Aware Pruning with Heaviside Continuous Approximations

    ICLR 2021

    Structured pruning methods are among the effective strategies for extracting small resource-efficient convolutional neural networks from their dense counterparts with minimal loss in accuracy. However, most existing methods still suffer from one or more limitations, including 1) the need for training the dense model from scratch with pruning-related parameters embedded in the architecture, 2) requiring model-specific hyperparameter settings, 3) inability to include budget-related constraints in the training process, and 4) instability under scenarios of extreme pruning. In this paper, we present ChipNet, a deterministic pruning strategy that employs a continuous Heaviside function and a novel crispness loss to identify a highly sparse network out of an existing dense network.

    Other authors
    See publication
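
    The key ingredients, budget-aware gates built from a continuous Heaviside approximation plus a crispness penalty, can be sketched as follows. The exact functional forms in ChipNet differ; this is a generic sigmoid-based stand-in meant only to show how a budget term and a crispness term can enter the training loss.

    import torch
    import torch.nn as nn

    def soft_heaviside(z, beta):
        """Sigmoid-based continuous approximation of the Heaviside step.
        Larger beta makes the gate sharper, i.e. closer to a hard 0/1 decision."""
        return torch.sigmoid(beta * z)

    scores = nn.Parameter(torch.randn(64))      # one learnable score per channel
    gates = soft_heaviside(scores, beta=10.0)   # soft pruning decisions in [0, 1]

    # Budget constraint expressed directly during training: penalize deviation of
    # the fraction of active channels from the target budget.
    budget = 0.3
    budget_loss = (gates.mean() - budget) ** 2

    # Crispness-style penalty: push each gate toward 0 or 1 so the soft selection
    # converges to a deterministic pruning mask.
    crispness_loss = (gates * (1.0 - gates)).mean()
    total_aux_loss = budget_loss + crispness_loss
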
  • Rescaling CNN through Learnable Repetition of Network Parameters

    IEEE International Conference on Image Processing (ICIP)

    Deeper and wider CNNs are known to provide improved performance for deep learning tasks. However, most such networks have poor performance gain per parameter increase. In this paper, we investigate whether the gain observed in deeper models is purely due to the addition of more optimization parameters or whether the physical size of the network also plays a role. Further, we present a novel rescaling strategy for CNNs based on learnable repetition of their parameters. Based on this strategy, we rescale CNNs without changing their parameter count, and show that learnable sharing of weights by itself can provide a significant boost in the performance of any given model without changing its parameter count. We show that small base networks, when rescaled, can provide performance comparable to deeper networks with as little as 6% of the optimization parameters of the deeper one. The relevance of weight sharing is further highlighted through the example of group-equivariant CNNs. We show that the significant improvements obtained with group-equivariant CNNs over regular CNNs on classification problems are only partly due to the added equivariance property, and part of it comes from the learnable repetition of network weights. For the rot-MNIST dataset, we show that up to 40% of the relative gain reported by state-of-the-art methods for rotation equivariance could actually be due to just the learnt repetition of weights.

    Other authors
    See publication
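
    The rescaling idea admits a compact sketch (my own illustration, not the paper's exact scheme): depth is added by repeating one shared convolution, and each repetition gets only a learnable scalar, so the optimization-parameter count of the convolution itself does not grow.

    import torch
    import torch.nn as nn

    class RepeatedConvStack(nn.Module):
        """Stack of conv layers that all reuse a single shared weight tensor.
        Only the one convolution and the per-repetition scales are optimized,
        so the network gets deeper without a matching increase in parameters."""
        def __init__(self, channels, repeats):
            super().__init__()
            self.shared = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            self.scales = nn.Parameter(torch.ones(repeats))  # learnable repetition weights

        def forward(self, x):
            for s in self.scales:
                x = torch.relu(self.shared(x) * s)
            return x

    stack = RepeatedConvStack(channels=16, repeats=4)
    out = stack(torch.randn(2, 16, 32, 32))  # parameter count of one conv layer plus 4 scalars
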
  • A new method for quantification of retinal blood vessel characteristics

    SPIE - Ophthalmic Technologies XXXI

    Uniform and quantitative grading of retinal vessel characteristics is replacing subjective and qualitative schemes. However, clinically accurate blood vessel extraction is very important. The tortuosity of these vessels is an important metric to study the curvature variations in normal and diseased eyes. In this study, we provide a new unsupervised and fully automated approach for studying curvature variation of the blood vessels, and then provide tortuosity quantification of the extracted vessels. We used optical coherence tomography angiography fundus images of dimensions 420x420 pixels, corresponding to 6mm x 6mm, and focused on the central circular 210x210 pixel region around the foveal avascular zone (FAZ) for tortuosity quantification. Our segmentation approach starts with a 3mm x 3mm central circular region extraction surrounding the FAZ. We then use a multi-scale, multi-span line detection filter to smooth out the high noise in the background and at the same time increase the intensity of target vessels. This is followed by a K-means procedure to separate the noise and target vessels into two categories. The next steps are morphological closing, noise removal, and iterative erosion of pixels to skeletonize the vessels. The final extracted vessels are in the form of single-pixel, piecewise-continuous fragments. These are finer than human annotations and at the same time free of noise. We then provide accurate standard tortuosity measures - Distance Measure, Inflection Points, Turning Points, etc. - for these OCTA images using the extracted vessels through mathematical modelling.

    Other authors
    See publication
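
    The extraction-and-quantification pipeline translates fairly directly into scikit-image / scikit-learn calls. The sketch below follows the steps in the abstract but swaps in off-the-shelf pieces (a Frangi filter instead of the paper's multi-scale, multi-span line detector, and a synthetic image instead of real OCTA data), so it illustrates the flow rather than reproducing the method.

    import numpy as np
    from skimage import filters, morphology
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    img = rng.random((420, 420))                 # stand-in for a 420x420 OCTA en-face image

    # Central crop around the FAZ (here simply the middle half of the image).
    h, w = img.shape
    crop = img[h // 4: 3 * h // 4, w // 4: 3 * w // 4]

    # Enhance curvilinear structures (stand-in for the multi-scale line detection filter).
    enhanced = filters.frangi(crop)

    # K-means into two clusters to separate vessel-like pixels from background.
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(enhanced.reshape(-1, 1)).reshape(enhanced.shape)
    vessel = labels == int(np.argmax([enhanced[labels == k].mean() for k in (0, 1)]))

    # Morphological closing, small-object removal, then thinning to one-pixel centerlines.
    mask = morphology.binary_closing(vessel, morphology.disk(2))
    mask = morphology.remove_small_objects(mask, min_size=30)
    skeleton = morphology.skeletonize(mask)

    # Distance-measure tortuosity of one traced fragment: arc length over chord length.
    # A real pipeline would first trace each skeleton fragment into an ordered point
    # list; `fragment` here is a small hypothetical polyline.
    fragment = np.array([[0, 0], [1, 1], [2, 1], [3, 2], [4, 2]], dtype=float)
    arc = np.sum(np.linalg.norm(np.diff(fragment, axis=0), axis=1))
    chord = np.linalg.norm(fragment[-1] - fragment[0])
    tortuosity = arc / chord
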
  • Is there a relationship between retinal blood vessel characteristics and ametropia?

    SPIE - Ophthalmic Technologies XXXI

    Retinal vasculature is affected in many ocular conditions, including diabetic retinopathy, glaucoma and age-related macular degeneration, and these alterations can be used as biomarkers. Therefore, it is important to segment and quantify retinal blood vessel characteristics (RBVC) accurately. Using a new automated image processing method applied to optical coherence tomography angiography (OCTA) images, we computed the RBVC and compared them between emmetropic (n=40) and ametropic (n=97) subjects. All 137 OCTA images had dimensions of 420x420 pixels corresponding to 6mm x 6mm. The myopic OCTA images were labelled based on a severity scale as mild, moderate, high and very high using standard refractive error classifications. Before image processing, all the images were cropped to 210x210 pixels keeping the foveal avascular zone (FAZ) at the centre to quantify the RBVC. The mean ± standard deviation of the Grisan index, a measure of retinal blood vessel tortuosity, in the emmetropic and myopic eyes was 0.05 ± 0.02 and 0.05 ± 0.03, respectively. The total vessel distance measures were calculated and found to be largest in emmetropic eyes (45.95 ± 19.54) and shortest in myopic eyes (6.50 ± 5.17). The total numbers of turning points and inflection points were found to be statistically significantly different (p<0.05) between control and myopic eyes. However, other RBVC parameters were not statistically different (p>0.05). We found qualitatively that RBVC change with increasing severity of the refractive error. Among RBVC parameters, the average number of turning points (NTP) shows a decreasing trend as the degree of myopia increases.

    Other authors
    See publication
  • Deep learning for detection and segmentation of artefact and disease instances in gastrointestinal endoscopy

    Medical Image Analysis

    The Endoscopy Computer Vision Challenge (EndoCV) is a crowd-sourcing initiative to address eminent problems in developing reliable computer aided detection and diagnosis endoscopy systems and suggest a pathway for clinical translation of technologies. Whilst endoscopy is a widely used diagnostic and treatment tool for hollow-organs, there are several core challenges often faced by endoscopists, mainly: 1) presence of multi-class artefacts that hinder their visual interpretation, and 2) difficulty in identifying subtle precancerous precursors and cancer abnormalities. Artefacts often affect the robustness of deep learning methods applied to the gastrointestinal tract organs as they can be confused with tissue of interest. EndoCV2020 challenges are designed to address research questions in these remits. In this paper, we present a summary of methods developed by the top 17 teams and provide an objective comparison of state-of-the-art methods and methods designed by the participants for two sub-challenges: i) artefact detection and segmentation (EAD2020), and ii) disease detection and segmentation (EDD2020). Multi-center, multi-organ, multi-class, and multi-modal clinical endoscopy datasets were compiled for both EAD2020 and EDD2020 sub-challenges. The out-of-sample generalization ability of detection algorithms was also evaluated. Whilst most teams focused on accuracy improvements, only a few methods hold credibility for clinical usability. The best performing teams provided solutions to tackle class imbalance, and variabilities in size, origin, modality and occurrences by exploring data augmentation, data fusion, and optimal class thresholding techniques.

    See publication
  • Multi-Plateau Ensemble for Endoscopic Artefact Segmentation and Detection

    EndoCV2020@ISBI 2020

    The Endoscopic Artefact Detection challenge consists of 1) artefact detection, 2) semantic segmentation, and 3) out-of-sample generalisation. For the semantic segmentation task, we propose a multi-plateau ensemble of FPN (Feature Pyramid Network) with EfficientNet as the feature extractor/encoder. For the object detection task, we used a three-model ensemble of RetinaNet with a ResNet50 backbone and Faster R-CNN (FPN + DC5) with a ResNeXt101 backbone. A PyTorch implementation of our approach to the problem is available at github.com/ubamba98/EAD2020.

    Other authors
    See publication
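
    For orientation, the two model families named above can be instantiated in a few lines. Library choices, the encoder variant and the class count below are my assumptions, not taken from the linked repository.

    import segmentation_models_pytorch as smp
    import torchvision

    # Segmentation branch: FPN decoder on an EfficientNet encoder.
    seg_model = smp.FPN(
        encoder_name="efficientnet-b3",   # assumed variant
        encoder_weights="imagenet",
        classes=8,                        # illustrative number of artefact classes
        activation=None,
    )

    # One member of the detection ensemble: RetinaNet with a ResNet-50 FPN backbone.
    det_model = torchvision.models.detection.retinanet_resnet50_fpn(weights="DEFAULT")
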
  • Transfer Learning Gaussian Anomaly Detection by Fine-Tuning Representations

    Pre-print under Review

    Current state-of-the-art Anomaly Detection (AD) methods exploit the powerful representations yielded by large-scale ImageNet training. However, catastrophic forgetting prevents the successful fine-tuning of pre-trained representations on new datasets in the semi/unsupervised setting, and representations are therefore commonly fixed.
    In our work, we propose a new method to fine-tune learned representations for AD in a transfer learning setting. Based on the linkage between generative and discriminative modeling, we induce a multivariate Gaussian distribution for the normal class, and use the Mahalanobis distance of normal images to the distribution as training objective. We additionally propose to use augmentations commonly employed for vicinal risk minimization in a validation scheme to detect onset of catastrophic forgetting.
    Extensive evaluations on the public MVTec AD dataset reveal that a new state of the art is achieved by our method in the AD task while simultaneously achieving anomaly segmentation (AS) performance comparable to prior state of the art. Further, ablation studies demonstrate the importance of the induced Gaussian distribution as well as the robustness of the proposed fine-tuning scheme with respect to the choice of augmentations.

    See publication
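
    The induced Gaussian and its Mahalanobis-distance objective are easy to write down; the sketch below shows only the static scoring step on toy features (in the paper the backbone producing the features is fine-tuned with this distance as the training objective).

    import torch

    feats = torch.randn(500, 64)   # toy features of normal training images (e.g. backbone embeddings)

    # Induce a multivariate Gaussian over the normal class.
    mean = feats.mean(dim=0)
    centered = feats - mean
    cov = centered.T @ centered / (feats.shape[0] - 1)
    cov_inv = torch.linalg.inv(cov + 1e-5 * torch.eye(64))

    def mahalanobis_sq(x):
        """Squared Mahalanobis distance of feature vectors x to the normal-class Gaussian."""
        d = x - mean
        return (d @ cov_inv * d).sum(dim=1)

    loss = mahalanobis_sq(feats).mean()                    # training objective on normal images
    anomaly_scores = mahalanobis_sq(torch.randn(10, 64))   # larger distance = more anomalous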

Honors & Awards

  • KVPY Scholar

    Indian Institute of Science (IISC) Bangalore
