“Milan has recently completed The Learning to Lead Programme at Sussex and has proved to be a dedicated and focused learner. He has continually been a key contributor to discussions and group exercises, with a keen eye on self-development. His capacity for assimilating new learning into his practice is strong, and I always found him to be very professional and enjoyable to work with. I wish him well in his future and feel he will make a worthy addition to any team or environment that he joins moving forward.”
About
A mixture of Science, Engineering, Business and Leadership skills, I prefer companies…
Activity
-
Google DeepMind and Massachusetts Institute of Technology present UniFluid, an autoregressive framework that unifies visual generation and…
Liked by Milan Gritta
-
2025 % of all code written by AI = 50% 2026 % of all code written by AI = 100% This was Anthropic CEO's prediction I say Accurate Prediction 2025 %…
Liked by Milan Gritta
-
✂️ AutoAbliteration I made a Colab notebook to automatically abliterate models in minutes. It's quite general, so you can do interesting stuff like…
Liked by Milan Gritta
Experience
Education
-
University of Cambridge
-
Activities and Societies: Data Science, Natural Language Processing, AI, Linguistics, Applied Machine Learning, Leadership, Philosophy, Human Behaviour Sciences
Panda Alert - Bioinformatics, NLP, Deep Machine Learning Research, Information Extraction
-
-
Activities and Societies: Cambridge University Entrepreneurs, 80,000 Hours, President of Fitzwilliam College Entrepreneurs Society, Cambridge University Technology and Enterprise Club
-
-
Activities and Societies: Learning to Lead Scheme; led a team of 6 to win the software engineering project ahead of almost 20 other student teams.
Publications
-
Multi3WOZ: A Multilingual, Multi-Domain, Multi-Parallel Dataset for Training and Evaluating Culturally Adapted Task-Oriented Dialog Systems
Transactions of the Association for Computational Linguistics
Creating high-quality annotated data for task-oriented dialog (ToD) is known to be notoriously difficult, and the challenges are amplified when the goal is to create equitable, culturally adapted, and large-scale ToD datasets for multiple languages. Therefore, the current datasets are still very scarce and suffer from limitations such as translation-based non-native dialogs with translation artefacts, small scale, or lack of cultural adaptation, among others. In this work, we first take stock of the current landscape of multilingual ToD datasets, offering a systematic overview of their properties and limitations. Aiming to reduce all the detected limitations, we then introduce Multi3WOZ, a novel multilingual, multi-domain, multi-parallel ToD dataset. It is large-scale and offers culturally adapted dialogs in 4 languages to enable training and evaluation of multilingual and cross-lingual ToD systems. We describe a complex bottom–up data collection process that yielded the final dataset, and offer the first sets of baseline scores across different ToD-related tasks for future reference, also highlighting its challenging nature.
-
A Systematic Study of Performance Disparities in Multilingual Task-Oriented Dialogue Systems
Achieving robust language technologies that can perform well across the world's many languages is a central goal of multilingual NLP. In this work, we take stock of and empirically analyse task performance disparities that exist between multilingual task-oriented dialogue (ToD) systems. We first define new quantitative measures of absolute and relative equivalence in system performance, capturing disparities across languages and within individual languages. Through a series of controlled experiments, we demonstrate that performance disparities depend on a number of factors: the nature of the ToD task at hand, the underlying pretrained language model, the target language, and the amount of ToD annotated data. We empirically prove the existence of the adaptation and intrinsic biases in current ToD systems: e.g., ToD systems trained for Arabic or Turkish using annotated ToD data fully parallel to English ToD data still exhibit diminished ToD task performance. Beyond providing a series of insights into the performance disparities of ToD systems in different languages, our analyses offer practical tips on how to approach ToD data collection and system development for new languages.
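The idea of measuring cross-lingual performance disparities can be illustrated with a toy sketch. A hedged note: the paper defines its own formal equivalence measures, which are not reproduced here; the absolute gap and relative ratio below are hypothetical stand-ins for the general idea, and the example scores are invented.

```python
# Illustrative sketch only: hypothetical stand-ins for the general idea of
# absolute and relative cross-lingual performance gaps. Not the paper's
# actual measures; example scores are invented.

def absolute_gap(source_score: float, target_score: float) -> float:
    """Absolute performance gap between source (e.g. English) and target language."""
    return source_score - target_score

def relative_equivalence(source_score: float, target_score: float) -> float:
    """Target performance as a fraction of source performance (1.0 = parity)."""
    return target_score / source_score

# Example: a ToD system scoring 90.0 on English vs 72.0 on Arabic.
print(absolute_gap(90.0, 72.0))                    # → 18.0
print(round(relative_equivalence(90.0, 72.0), 2))  # → 0.8
```

A ratio-style measure has the practical advantage that it stays comparable across tasks whose raw scores sit on different scales.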
-
PanGu-Coder: Program Synthesis with Function-Level Language Modeling
arXiv preprint arXiv:2207.11280
We present PanGu-Coder, a pretrained decoder-only language model adopting the PanGu-Alpha architecture for text-to-code generation, i.e. the synthesis of programming language solutions given a natural language problem description. We train PanGu-Coder using a two-stage strategy: the first stage employs Causal Language Modelling (CLM) to pre-train on raw programming language data, while the second stage uses a combination of Causal Language Modelling and Masked Language Modelling (MLM) training objectives that focus on the downstream task of text-to-code generation and train on loosely curated pairs of natural language program definitions and code functions. Finally, we discuss PanGu-Coder-FT, which is fine-tuned on a combination of competitive programming problems and code with continuous integration tests. We evaluate PanGu-Coder with a focus on whether it generates functionally correct programs and demonstrate that it achieves equivalent or better performance than similarly sized models, such as CodeX, while attending a smaller context window and training on less data.
-
CrossAligner & Co: Zero-Shot Transfer Methods for Task-Oriented Cross-lingual Natural Language Understanding
Findings of the Association for Computational Linguistics: ACL 2022
Task-oriented personal assistants enable people to interact with a host of devices and services using natural language. One of the challenges of making neural dialogue systems available to more users is the lack of training data for all but a few languages. Zero-shot methods try to solve this issue by acquiring task knowledge in a high-resource language such as English with the aim of transferring it to the low-resource language(s). To this end, we introduce CrossAligner, the principal method of a variety of effective approaches for zero-shot cross-lingual transfer based on learning alignment from unlabelled parallel data. We present a quantitative analysis of individual methods as well as their weighted combinations, several of which exceed state-of-the-art (SOTA) scores as evaluated across nine languages, fifteen test sets and three benchmark multilingual datasets. A detailed qualitative error analysis of the best methods shows that our fine-tuned language models can zero-shot transfer the task knowledge better than anticipated.
-
XeroAlign: Zero-Shot Cross-lingual Transformer Alignment
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
The introduction of pretrained cross-lingual language models brought decisive improvements to multilingual NLP tasks. However, the lack of labelled task data necessitates a variety of methods aiming to close the gap to high-resource languages. Zero-shot methods in particular, often use translated task data as a training signal to bridge the performance gap between the source and target language(s). We introduce XeroAlign, a simple method for task-specific alignment of cross-lingual pretrained transformers such as XLM-R. XeroAlign uses translated task data to encourage the model to generate similar sentence embeddings for different languages. The XeroAligned XLM-R, called XLM-RA, shows strong improvements over the baseline models to achieve state-of-the-art zero-shot results on three multilingual natural language understanding tasks. XLM-RA's text classification accuracy exceeds that of XLM-R trained with labelled data and performs on par with state-of-the-art models on a cross-lingual adversarial paraphrasing task.
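The core alignment idea can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes mean-pooled sentence embeddings represented as plain float lists, and uses a mean-squared-distance loss as a hypothetical stand-in for the similarity objective the full method optimises alongside the task loss on XLM-R representations.

```python
# Hedged sketch of the alignment idea: push the embedding of a translated
# sentence towards the embedding of its source-language counterpart.
# Toy float lists stand in for pooled transformer sentence embeddings.

def mse_alignment_loss(src_emb: list[float], tgt_emb: list[float]) -> float:
    """Mean squared distance between a pair of sentence embeddings."""
    assert len(src_emb) == len(tgt_emb)
    return sum((s - t) ** 2 for s, t in zip(src_emb, tgt_emb)) / len(src_emb)

# Identical embeddings incur zero loss; divergent ones are penalised.
print(mse_alignment_loss([0.1, 0.4], [0.1, 0.4]))  # → 0.0
print(mse_alignment_loss([0.1, 0.4], [0.3, 0.0]))  # ≈ 0.1
```

Minimising such a loss on translated task data encourages the encoder to map a sentence and its translation to nearby points, which is what makes zero-shot transfer of a task head plausible.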
-
Conversation Graph: Data Augmentation, Training and Evaluation for Non-Deterministic Dialogue Management
Transactions of the Association for Computational Linguistics
Task-oriented dialogue systems typically rely on large amounts of high-quality training data or require complex handcrafted rules. However, existing datasets are often limited in size considering the complexity of the dialogues. Additionally, conventional training signal inference is not suitable for non-deterministic agent behavior, namely, considering multiple actions as valid in identical dialogue states. We propose the Conversation Graph (ConvGraph), a graph-based representation of dialogues that can be exploited for data augmentation, multi-reference training and evaluation of non-deterministic agents. ConvGraph generates novel dialogue paths to augment data volume and diversity. Intrinsic and extrinsic evaluation across three datasets shows that data augmentation and/or multi-reference training with ConvGraph can improve dialogue success rates by up to 6.4%.
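The graph construction and path generation described above can be sketched as follows. This is a hedged toy version: the state and action names are hypothetical, dialogues are reduced to (state, action) pairs, and the real ConvGraph representation is richer than this dictionary-of-sets simplification.

```python
from collections import defaultdict
from itertools import product

# Toy sketch of the ConvGraph idea: dialogues that pass through the same
# dialogue state are merged into a graph, and walking that graph yields
# novel state/action paths for data augmentation. All names are invented.

def build_graph(dialogues):
    """Map each dialogue state to the set of actions observed as valid in it."""
    graph = defaultdict(set)
    for dialogue in dialogues:
        for state, action in dialogue:
            graph[state].add(action)
    return graph

def augmented_paths(graph, states):
    """Enumerate every action combination along a fixed sequence of states."""
    options = [sorted(graph[s]) for s in states]
    return [list(zip(states, combo)) for combo in product(*options)]

# Two observed dialogues over the same pair of states...
d1 = [("greet", "ask_area"), ("inform", "offer_hotel")]
d2 = [("greet", "ask_price"), ("inform", "offer_taxi")]
g = build_graph([d1, d2])
paths = augmented_paths(g, ["greet", "inform"])
print(len(paths))  # → 4 (2 observed dialogues plus 2 novel recombinations)
```

Treating multiple actions as valid in the same state is exactly what a dialogue-as-sequence dataset cannot express, which is why the graph view enables both augmentation and multi-reference evaluation.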
-
A Pragmatic Guide to Geoparsing Evaluation
Under review @ Springer (Language Resources and Evaluation)
The manuscript introduces this framework in three parts. Part 1) Task Definition: clarified via corpus linguistic analysis proposing a fine-grained Pragmatic Taxonomy of Toponyms with new guidelines. Part 2) Evaluation Data: shared via a dataset called GeoWebNews to provide test/train data to enable immediate use of our contributions. In addition to fine-grained Geotagging and Toponym Resolution (Geocoding), this dataset is also suitable for prototyping machine learning NLP models. Part 3) Metrics: discussed and reviewed for a rigorous evaluation with appropriate recommendations for NER/Geoparsing practitioners.
-
Which Melbourne? Augmenting Geocoding with Maps.
Association for Computational Linguistics 2018
ACL 2018 paper presented in Melbourne, Australia. The purpose of text geolocation is to associate geographic information contained in a document with a set (or sets) of coordinates, either implicitly by using linguistic features and/or explicitly by using geographic metadata combined with heuristics.
Topics: Geoparsing, Toponym Resolution, Geocoding, Named Entity Recognition, Information Retrieval, Machine Learning, Deep Learning.
-
Vancouver Welcomes You! Minimalist Metonymy Resolution
Association for Computational Linguistics
Outstanding Paper Award 2017 @ ACL
We show how a minimalist neural approach combined with a novel predicate window method can achieve competitive results on the SemEval 2007 task on Metonymy Resolution. Additionally, we contribute a new Wikipedia-based MR dataset called RelocaR, which is tailored towards locations and improves on previous deficiencies in annotation guidelines.
-
What’s missing in geographical parsing?
Springer - Language Resources and Evaluation
In this study, we evaluate and analyse the performance of a number of leading geoparsers on a number of corpora and highlight the challenges in detail. We also publish an automatically geotagged Wikipedia corpus to alleviate the dearth of (open source) corpora in this domain.
Honors & Awards
-
Outstanding Paper Award at www.acl2017.org
Association for Computational Linguistics
LINK: https://acl2017.wordpress.com/2017/06/02/wednesday-2-august-detailed-program/
TITLE: Vancouver Welcomes You! Minimalist Location Metonymy Resolution
RESOURCES: https://github.com/milangritta/Minimalist-Location-Metonymy-Resolution
-
Natural Environment Research Council Doctoral Studentship
NERC Centre for Doctoral Training in Data, Risk and Environmental Analytical Methods
Bioinformatics, NLP, DTAL, MML Cambridge
-
Winners of AppathonUK 2014
Founders4Schools
Two-day hackathon building apps for children to tackle the growing illiteracy problem. The app, called Clever Creatures, allows children (ages 3 to 7) to practise spelling and reading, learn new words, do basic maths and win rewards for doing it.
Test Scores
-
IELTS
Score: 8 out of 9
International English Language Testing System for Academics
Languages
-
German
Limited working proficiency
-
Slovak
Native or bilingual proficiency
-
English
Native or bilingual proficiency
Recommendations received
1 person has recommended Milan
More activity by Milan
-
Quick reminder: I'm charging $1,000/hour to fix your vibe-coded mess.
Liked by Milan Gritta
-
People who don’t know this kind of stuff probably should not be commenting on the viability of coding being done entirely by machine.
Liked by Milan Gritta
-
Pizza won't motivate your developers. Beers won't motivate your developers. Ping-pong won't motivate your developers. Your developers don't need…
Liked by Milan Gritta
-
I can’t get enough of this meme 😂. Are you down that path yet?
Liked by Milan Gritta
-
Are you ready for a faster Hugging Face? 🏃♀️💨 Do you want to be among the first to try the future? We are replacing LFS with Xet, the HF Hub's…
Liked by Milan Gritta
-
At least 82 Britons are known to have died after taking drugs such as Ozempic and Mounjaro, according to the regulator https://lnkd.in/ePF6RYbe
Liked by Milan Gritta
-
A stunning essay in the Financial Times on the international decline in the ability to read, reason, focus, and learn new things. It began or…
Liked by Milan Gritta
-
STOP RAIDING OUR SMALLER BUSINESSES !!!! They’re at it again. This time, they’re coming for small business owners I know AGAIN AGAIN .. with a…
Liked by Milan Gritta
-
𝗘𝘃𝗲𝗿𝘆𝗼𝗻𝗲 𝗰𝗮𝗻 𝗯𝘂𝗶𝗹𝗱 𝗮𝗽𝗽𝘀 𝗻𝗼𝘄! People with 𝘇𝗲𝗿𝗼 programming experience are now publishing apps. 🚀 AI can generate…
Liked by Milan Gritta
-
Imagine interviewing a candidate who looks like a very strong coder. Almost extending an offer. But turns out, the candidate does not exist. This…
Liked by Milan Gritta
-
𝗧𝗵𝗲 𝗟𝗶𝗺𝗶𝘁𝗮𝘁𝗶𝗼𝗻𝘀 𝗼𝗳 𝗦𝘂𝗺𝗺𝗮𝗿𝘆 𝗦𝘁𝗮𝘁𝗶𝘀𝘁𝗶𝗰𝘀! 📊💡 Datasaurus Dozen is a group of datasets that have identical descriptive…
Liked by Milan Gritta
-
Exciting first in-person project update meeting for our Innovate UK KTP project! Had a fantastic day at AstraZeneca's Discovery Centre in Cambridge,…
Liked by Milan Gritta