“Milan has recently completed The Learning to Lead Programme at Sussex and has proved to be a dedicated and focused learner. He has continually been a key contributor to discussions and group exercises, with a keen eye on self-development. His capacity for assimilating new learning into his practice is strong, and I always found him to be very professional and enjoyable to work with. I wish him well in his future and feel he will make a worthy addition to any team or environment that he joins moving forward.”
About
A mixture of Science, Engineering, Business and Leadership skills, I prefer companies…
Activity
-
Google DeepMind and Massachusetts Institute of Technology present UniFluid, an autoregressive framework that unifies visual generation and…
Liked by Milan Gritta
-
2025 % of all code written by AI = 50% 2026 % of all code written by AI = 100% This was Anthropic CEO's prediction I say Accurate Prediction 2025 %…
Liked by Milan Gritta
-
✂️ AutoAbliteration I made a Colab notebook to automatically abliterate models in minutes. It's quite general, so you can do interesting stuff like…
Liked by Milan Gritta
Experience
Education
-
University of Cambridge
-
Activities and Societies: Data Science, Natural Language Processing, AI, Linguistics, Applied Machine Learning, Leadership, Philosophy, Human Behaviour Sciences
Panda Alert - Bioinformatics, NLP, Deep Machine Learning Research, Information Extraction
-
-
Activities and Societies: Cambridge University Entrepreneurs, 80,000 Hours, President of Fitzwilliam College Entrepreneurs Society, Cambridge University Technology and Enterprise Club
-
-
Activities and Societies: Learning to Lead Scheme; led a team of 6 to win the software engineering project ahead of almost 20 other student teams.
Publications
-
Multi3WOZ: A Multilingual, Multi-Domain, Multi-Parallel Dataset for Training and Evaluating Culturally Adapted Task-Oriented Dialog Systems
Transactions of the Association for Computational Linguistics
Creating high-quality annotated data for task-oriented dialog (ToD) is known to be notoriously difficult, and the challenges are amplified when the goal is to create equitable, culturally adapted, and large-scale ToD datasets for multiple languages. Therefore, the current datasets are still very scarce and suffer from limitations such as translation-based non-native dialogs with translation artefacts, small scale, or lack of cultural adaptation, among others. In this work, we first take stock of the current landscape of multilingual ToD datasets, offering a systematic overview of their properties and limitations. Aiming to reduce all the detected limitations, we then introduce Multi3WOZ, a novel multilingual, multi-domain, multi-parallel ToD dataset. It is large-scale and offers culturally adapted dialogs in 4 languages to enable training and evaluation of multilingual and cross-lingual ToD systems. We describe a complex bottom–up data collection process that yielded the final dataset, and offer the first sets of baseline scores across different ToD-related tasks for future reference, also highlighting its challenging nature.
-
A Systematic Study of Performance Disparities in Multilingual Task-Oriented Dialogue Systems
Achieving robust language technologies that can perform well across the world's many languages is a central goal of multilingual NLP. In this work, we take stock of and empirically analyse task performance disparities that exist between multilingual task-oriented dialogue (ToD) systems. We first define new quantitative measures of absolute and relative equivalence in system performance, capturing disparities across languages and within individual languages. Through a series of controlled experiments, we demonstrate that performance disparities depend on a number of factors: the nature of the ToD task at hand, the underlying pretrained language model, the target language, and the amount of ToD annotated data. We empirically prove the existence of the adaptation and intrinsic biases in current ToD systems: e.g., ToD systems trained for Arabic or Turkish using annotated ToD data fully parallel to English ToD data still exhibit diminished ToD task performance. Beyond providing a series of insights into the performance disparities of ToD systems in different languages, our analyses offer practical tips on how to approach ToD data collection and system development for new languages.
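The idea of measuring cross-lingual performance disparities can be illustrated with a toy sketch. A hedged note: the paper defines its own formal equivalence measures, which are not reproduced here; the absolute gap and relative ratio below are hypothetical stand-ins for the general idea, and the example scores are invented.

```python
# Illustrative sketch only: hypothetical stand-ins for the general idea of
# absolute and relative cross-lingual performance gaps. Not the paper's
# actual measures; example scores are invented.

def absolute_gap(source_score: float, target_score: float) -> float:
    """Absolute performance gap between source (e.g. English) and target language."""
    return source_score - target_score

def relative_equivalence(source_score: float, target_score: float) -> float:
    """Target performance as a fraction of source performance (1.0 = parity)."""
    return target_score / source_score

# Example: a ToD system scoring 90.0 on English vs 72.0 on Arabic.
print(absolute_gap(90.0, 72.0))                    # → 18.0
print(round(relative_equivalence(90.0, 72.0), 2))  # → 0.8
```

A ratio-style measure has the practical advantage that it stays comparable across tasks whose raw scores sit on different scales.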
-
PanGu-Coder: Program Synthesis with Function-Level Language Modeling
arXiv preprint arXiv:2207.11280
We present PanGu-Coder, a pretrained decoder-only language model adopting the PanGu-Alpha architecture for text-to-code generation, i.e. the synthesis of programming language solutions given a natural language problem description. We train PanGu-Coder using a two-stage strategy: the first stage employs Causal Language Modelling (CLM) to pre-train on raw programming language data, while the second stage uses a combination of Causal Language Modelling and Masked Language Modelling (MLM) training objectives that focus on the downstream task of text-to-code generation and train on loosely curated pairs of natural language program definitions and code functions. Finally, we discuss PanGu-Coder-FT, which is fine-tuned on a combination of competitive programming problems and code with continuous integration tests. We evaluate PanGu-Coder with a focus on whether it generates functionally correct programs and demonstrate that it achieves equivalent or better performance than similarly sized models, such as CodeX, while attending a smaller context window and training on less data.
-
CrossAligner & Co: Zero-Shot Transfer Methods for Task-Oriented Cross-lingual Natural Language Understanding
Findings of the Association for Computational Linguistics: ACL 2022
Task-oriented personal assistants enable people to interact with a host of devices and services using natural language. One of the challenges of making neural dialogue systems available to more users is the lack of training data for all but a few languages. Zero-shot methods try to solve this issue by acquiring task knowledge in a high-resource language such as English with the aim of transferring it to the low-resource language(s). To this end, we introduce CrossAligner, the principal method of a variety of effective approaches for zero-shot cross-lingual transfer based on learning alignment from unlabelled parallel data. We present a quantitative analysis of individual methods as well as their weighted combinations, several of which exceed state-of-the-art (SOTA) scores as evaluated across nine languages, fifteen test sets and three benchmark multilingual datasets. A detailed qualitative error analysis of the best methods shows that our fine-tuned language models can zero-shot transfer the task knowledge better than anticipated.
-
XeroAlign: Zero-Shot Cross-lingual Transformer Alignment
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
The introduction of pretrained cross-lingual language models brought decisive improvements to multilingual NLP tasks. However, the lack of labelled task data necessitates a variety of methods aiming to close the gap to high-resource languages. Zero-shot methods in particular, often use translated task data as a training signal to bridge the performance gap between the source and target language(s). We introduce XeroAlign, a simple method for task-specific alignment of cross-lingual pretrained transformers such as XLM-R. XeroAlign uses translated task data to encourage the model to generate similar sentence embeddings for different languages. The XeroAligned XLM-R, called XLM-RA, shows strong improvements over the baseline models to achieve state-of-the-art zero-shot results on three multilingual natural language understanding tasks. XLM-RA's text classification accuracy exceeds that of XLM-R trained with labelled data and performs on par with state-of-the-art models on a cross-lingual adversarial paraphrasing task.
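The core alignment idea can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes mean-pooled sentence embeddings represented as plain float lists, and uses a mean-squared-distance loss as a hypothetical stand-in for the similarity objective the full method optimises alongside the task loss on XLM-R representations.

```python
# Hedged sketch of the alignment idea: push the embedding of a translated
# sentence towards the embedding of its source-language counterpart.
# Toy float lists stand in for pooled transformer sentence embeddings.

def mse_alignment_loss(src_emb: list[float], tgt_emb: list[float]) -> float:
    """Mean squared distance between a pair of sentence embeddings."""
    assert len(src_emb) == len(tgt_emb)
    return sum((s - t) ** 2 for s, t in zip(src_emb, tgt_emb)) / len(src_emb)

# Identical embeddings incur zero loss; divergent ones are penalised.
print(mse_alignment_loss([0.1, 0.4], [0.1, 0.4]))  # → 0.0
print(mse_alignment_loss([0.1, 0.4], [0.3, 0.0]))  # ≈ 0.1
```

Minimising such a loss on translated task data encourages the encoder to map a sentence and its translation to nearby points, which is what makes zero-shot transfer of a task head plausible.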
-
Conversation Graph: Data Augmentation, Training and Evaluation for Non-Deterministic Dialogue Management
Transactions of the Association for Computational Linguistics
Task-oriented dialogue systems typically rely on large amounts of high-quality training data or require complex handcrafted rules. However, existing datasets are often limited in size considering the complexity of the dialogues. Additionally, conventional training signal inference is not suitable for non-deterministic agent behavior, namely, considering multiple actions as valid in identical dialogue states. We propose the Conversation Graph (ConvGraph), a graph-based representation of dialogues that can be exploited for data augmentation, multi-reference training and evaluation of non-deterministic agents. ConvGraph generates novel dialogue paths to augment data volume and diversity. Intrinsic and extrinsic evaluation across three datasets shows that data augmentation and/or multi-reference training with ConvGraph can improve dialogue success rates by up to 6.4%.
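The graph construction and path generation described above can be sketched as follows. This is a hedged toy version: the state and action names are hypothetical, dialogues are reduced to (state, action) pairs, and the real ConvGraph representation is richer than this dictionary-of-sets simplification.

```python
from collections import defaultdict
from itertools import product

# Toy sketch of the ConvGraph idea: dialogues that pass through the same
# dialogue state are merged into a graph, and walking that graph yields
# novel state/action paths for data augmentation. All names are invented.

def build_graph(dialogues):
    """Map each dialogue state to the set of actions observed as valid in it."""
    graph = defaultdict(set)
    for dialogue in dialogues:
        for state, action in dialogue:
            graph[state].add(action)
    return graph

def augmented_paths(graph, states):
    """Enumerate every action combination along a fixed sequence of states."""
    options = [sorted(graph[s]) for s in states]
    return [list(zip(states, combo)) for combo in product(*options)]

# Two observed dialogues over the same pair of states...
d1 = [("greet", "ask_area"), ("inform", "offer_hotel")]
d2 = [("greet", "ask_price"), ("inform", "offer_taxi")]
g = build_graph([d1, d2])
paths = augmented_paths(g, ["greet", "inform"])
print(len(paths))  # → 4 (2 observed dialogues plus 2 novel recombinations)
```

Treating multiple actions as valid in the same state is exactly what a dialogue-as-sequence dataset cannot express, which is why the graph view enables both augmentation and multi-reference evaluation.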
-
A Pragmatic Guide to Geoparsing Evaluation
Under review @ Springer (Language Resources and Evaluation)
The manuscript introduces this framework in three parts. Part 1) Task Definition: clarified via corpus linguistic analysis proposing a fine-grained Pragmatic Taxonomy of Toponyms with new guidelines. Part 2) Evaluation Data: shared via a dataset called GeoWebNews to provide test/train data to enable immediate use of our contributions. In addition to fine-grained Geotagging and Toponym Resolution (Geocoding), this dataset is also suitable for prototyping machine learning NLP models. Part 3) Metrics: discussed and reviewed for a rigorous evaluation with appropriate recommendations for NER/Geoparsing practitioners.
-
Which Melbourne? Augmenting Geocoding with Maps.
Association for Computational Linguistics 2018
ACL 2018 paper presented in Melbourne, Australia. The purpose of text geolocation is to associate geographic information contained in a document with a set (or sets) of coordinates, either implicitly by using linguistic features and/or explicitly by using geographic metadata combined with heuristics.
Topics: Geoparsing, Toponym Resolution, Geocoding, Named Entity Recognition, Information Retrieval, Machine Learning, Deep Learning.
-
Vancouver Welcomes You! Minimalist Metonymy Resolution
Association for Computational Linguistics
Outstanding Paper Award 2017 @ ACL
We show how a minimalist neural approach combined with a novel predicate window method can achieve competitive results on the SemEval 2007 task on Metonymy Resolution. Additionally, we contribute a new Wikipedia-based MR dataset called RelocaR, which is tailored towards locations and improves on previous deficiencies in annotation guidelines.
-
What’s missing in geographical parsing?
Springer - Language Resources and Evaluation
In this study, we evaluate and analyse the performance of a number of leading geoparsers on a number of corpora and highlight the challenges in detail. We also publish an automatically geotagged Wikipedia corpus to alleviate the dearth of (open source) corpora in this domain.
Honors & Awards
-
Outstanding Paper Award at www.acl2017.org
Association for Computational Linguistics
LINK: https://acl2017.wordpress.com/2017/06/02/wednesday-2-august-detailed-program/
TITLE: Vancouver Welcomes You! Minimalist Location Metonymy Resolution
RESOURCES: https://github.com/milangritta/Minimalist-Location-Metonymy-Resolution
-
Natural Environment Research Council Doctoral Studentship
NERC Centre for Doctoral Training in Data, Risk and Environmental Analytical Methods
Bioinformatics, NLP, DTAL, MML Cambridge
-
Winners of AppathonUK 2014
Founders4Schools
Two-day hackathon building apps for children to tackle the growing illiteracy problem. The app, called Clever Creatures, allows children (ages 3 to 7) to practise spelling and reading, learn new words, do basic maths and win rewards for doing it.
Test Scores
-
IELTS
Score: 8 out of 9
International English Language Testing System for Academics
Languages
-
German
Limited working proficiency
-
Slovak
Native or bilingual proficiency
-
English
Native or bilingual proficiency
Recommendations received
1 person has recommended Milan
More activity by Milan
-
Quick reminder: I'm charging $1,000/hour to fix your vibe-coded mess.
Liked by Milan Gritta
-
People who don’t know this kind of stuff probably should not be commenting on the viability of coding being done entirely by machine.
Liked by Milan Gritta
-
Pizza won't motivate your developers. Beers won't motivate your developers. Ping-pong won't motivate your developers. Your developers don't need…
Liked by Milan Gritta
-
I can’t get enough of this meme 😂. Are you down that path yet?
Liked by Milan Gritta
-
Are you ready for a faster Hugging Face? 🏃♀️💨 Do you want to be among the first to try the future? We are replacing LFS with Xet, the HF Hub's…
Liked by Milan Gritta
-
At least 82 Britons are known to have died after taking drugs such as Ozempic and Mounjaro, according to the regulator https://lnkd.in/ePF6RYbe
Liked by Milan Gritta
-
A stunning essay in the Financial Times on the international decline in the ability to read, reason, focus, and learn new things. It began or…
Liked by Milan Gritta
-
STOP RAIDING OUR SMALLER BUSINESSES !!!! They’re at it again. This time, they’re coming for small business owners I know AGAIN AGAIN .. with a…
Liked by Milan Gritta
-
𝗘𝘃𝗲𝗿𝘆𝗼𝗻𝗲 𝗰𝗮𝗻 𝗯𝘂𝗶𝗹𝗱 𝗮𝗽𝗽𝘀 𝗻𝗼𝘄! People with 𝘇𝗲𝗿𝗼 programming experience are now publishing apps. 🚀 AI can generate…
Liked by Milan Gritta
-
Imagine interviewing a candidate who looks like a very strong coder. Almost extending an offer. But turns out, the candidate does not exist. This…
Liked by Milan Gritta
-
𝗧𝗵𝗲 𝗟𝗶𝗺𝗶𝘁𝗮𝘁𝗶𝗼𝗻𝘀 𝗼𝗳 𝗦𝘂𝗺𝗺𝗮𝗿𝘆 𝗦𝘁𝗮𝘁𝗶𝘀𝘁𝗶𝗰𝘀! 📊💡 Datasaurus Dozen is a group of datasets that have identical descriptive…
Liked by Milan Gritta
-
Exciting first in-person project update meeting for our Innovate UK KTP project! Had a fantastic day at AstraZeneca's Discovery Centre in Cambridge,…
Liked by Milan Gritta