“Gautier was an excellent student who developed and evaluated a P2P video-on-demand application while at SICS. We tried to get him to stay at SICS to do a PhD, but he chose to move back to France. He has very good Java programming skills. I strongly recommend him for any R&D position.”
Gautier Berthou
Stockholm, Stockholm County, Sweden
425 followers
413 connections
Activity
-
Now we are releasing our system for mapping and reporting climate emissions! Previously only for large public organizations; now Svalna's…
Liked by Gautier Berthou
-
Amazing AI Summit event yesterday at OVHcloud in their STATION F offices. It was a very intense - action packed event and am glad we got invited and…
Liked by Gautier Berthou
Experience
Education
-
Université Grenoble Alpes
This Ph.D. focused on information dissemination in computer networks. I studied two aspects of this topic: anonymous communication on the Internet with rational nodes, and uniform total order broadcast in a computer cluster. Concerning the first aspect, I observed that no existing anonymous communication protocol could both work with rational nodes and scale. Therefore, I proposed RAC, the first anonymous communication protocol that functions with rational nodes and scales. Concerning the second aspect, I observed that no existing uniform total order broadcast protocol ensures both low latency and optimal throughput. To fill this gap, I proposed FastCast, the first uniform total order broadcast protocol ensuring both optimal throughput and low latency.
-
Activities and societies: PromenadorQuestern
-
Activities and societies: grand cercle 2007-2008
TOEFL (iBT): 95/120
-
Activities and societies: grand cercle 2007-2008
ENSIMAG
TOEFL (iBT): 95/120
Publications
-
HopsFS-S3: Extending Object Stores with POSIX-like Semantics and more
In Proceedings of the 21st International Middleware Conference Industrial Track
Object stores have become the de-facto platform for storage in the cloud due to their scalability, high availability, and low cost. However, they provide weaker metadata semantics and lower performance compared to distributed hierarchical file systems. In this paper, we introduce HopsFS-S3, a hybrid distributed hierarchical file system backed by an object store while preserving the file system's strong consistency semantics. We base our implementation on HopsFS, a next-generation distribution of HDFS with distributed metadata. We redesigned HopsFS' block storage layer to transparently use an object store to store the file's blocks without sacrificing the file system's semantics. We also introduced a new block caching service to leverage faster NVMe storage for hot blocks. In our experiments, we show that HopsFS-S3 outperforms EMRFS for IO-bound workloads, with up to 20% higher performance, and delivers up to 3.4X the aggregated read throughput of EMRFS. Moreover, we demonstrate that metadata operations on HopsFS-S3 (such as directory rename) are up to two orders of magnitude faster than EMRFS. Finally, HopsFS-S3 opens up the currently closed metadata in object stores, enabling correctly-ordered change notifications with HopsFS' change data capture (CDC) API and customized extensions to metadata.
-
Hopsworks: Improving user experience and development on Hadoop with scalable, strongly consistent metadata
In Proceedings of the IEEE 37th International Conference on Distributed Computing Systems (ICDCS)
Hadoop is a popular system for storing, managing, and processing large volumes of data, but it has bare-bones internal support for metadata, as metadata is a bottleneck and less means more scalability. The result is a scalable platform with rudimentary access control that is neither user- nor developer-friendly. Also, metadata services that are built on Hadoop, such as SQL-on-Hadoop, access control, data provenance, and data governance are necessarily implemented as eventually consistent services, resulting in increased development effort and more brittle software. In this paper, we present a new project-based multi-tenancy model for Hadoop, built on a new distribution of Hadoop that provides a distributed database backend for the Hadoop Distributed Filesystem's (HDFS) metadata layer. We extend Hadoop's metadata model to introduce projects, datasets, and project-users as new core concepts that enable a user-friendly, UI-driven Hadoop experience. As our metadata service is backed by a transactional database, developers can easily extend metadata by adding new tables and ensure the strong consistency of extended metadata using both transactions and foreign keys.
-
Leader election using NewSQL database systems
In Proceedings of the IFIP International Conference on Distributed Applications and Interoperable Systems
Leader election protocols are a fundamental building block for replicated distributed services. They ease the design of leader-based coordination protocols that tolerate failures. In partially synchronous systems, designing a leader election algorithm, that does not permit multiple leaders while the system is unstable, is a complex task. As a result many production systems use third-party distributed coordination services, such as ZooKeeper and Chubby, to provide a reliable leader election service. However, adding a third-party service such as ZooKeeper to a distributed system incurs additional operational costs and complexity. ZooKeeper instances must be kept running on at least three machines to ensure its high availability. In this paper, we present a novel leader election protocol using NewSQL databases for partially synchronous systems, that ensures at most one leader at any given time. The leader election protocol uses the database as distributed shared memory. Our work enables distributed systems that already use NewSQL databases to save the operational overhead of managing an additional third-party service for leader election. Our main contribution is the design, implementation and validation of a practical leader election algorithm, based on NewSQL databases, that has performance comparable to a leader election implementation using a state-of-the-art distributed coordination service, ZooKeeper.
-
FastCast: A Throughput- and Latency-Efficient Total Order Broadcast Protocol
Middleware 2013
Many uniform total order broadcast protocols have been designed in the last 30 years. Unfortunately, none of them achieves both optimal throughput and low latency. Indeed, protocols achieving optimal throughput rely on a ring dissemination pattern, which induces high latencies. Protocols achieving low latency rely on IP multicast and fail to achieve good throughput because of message losses. In this paper, we describe FastCast, the first protocol that achieves both optimal throughput and low latency. To achieve low latency, FastCast relies on IP multicast. To achieve optimal throughput, FastCast defines a protocol responsible for dynamically computing the throughput at which processes can send IP multicast messages. Thanks to this dynamic bandwidth allocation protocol, FastCast allows multiple processes to simultaneously send messages, while avoiding message losses. An evaluation of FastCast on a cluster of 8 machines shows that it indeed achieves optimal throughput and a very low latency.
-
RAC : a Freerider-resilient, Scalable, Anonymous Communication Protocol
ICDCS 2013
Enabling anonymous communication over the Internet is crucial. The first protocols that have been devised for anonymous communication are subject to freeriding. Recent protocols have thus been proposed to deal with this issue. However, these protocols do not scale to large systems, and some of them further assume the existence of trusted servers. In this paper, we present RAC, the first anonymous communication protocol that tolerates freeriders and that scales to large systems. Scalability comes from the fact that the complexity of RAC in terms of the number of message exchanges is independent of the number of nodes in the system. Another important aspect of RAC is that it does not rely on any trusted third party. We theoretically prove, using game theory, that our protocol is a Nash equilibrium, i.e., that freeriders have no interest in deviating from the protocol. Further, we experimentally evaluate RAC using simulations. Our evaluation shows that, whatever the size of the system (up to 100,000 nodes), the nodes participating in the system observe the same throughput.
-
P2P VoD using the self-organizing gradient overlay network
Proceedings of the second international workshop on Self-organizing architectures
Peer-to-peer (P2P) video-on-demand (VoD) requires that nodes collaborate in the downloading of video files as a number of file pieces. In general for VoD, a node is only interested in another node's video file pieces if its download position in the video file precedes the download position of the other node. In this paper, we capture this neighbour relation using the Gradient overlay network, a gossip-generated P2P topology. The Gradient network topology self-organizes into logical concentric rings, such that nodes at earlier download positions in the file are found at increasing distances from the centre, while nodes that have downloaded the whole file are located in the centre of the topology. We build a P2PVoD protocol that uses nodes sampled from the Gradient overlay, and we evaluate its performance in simulation. We also present the layered gossiping architecture and discuss the role of self-organizing mechanisms in the Gradient overlay network's architecture, including positive and negative feedback, decay, and external events from the environment.
Languages
-
French
Native or bilingual proficiency
-
English
Professional working proficiency
-
Swedish
Elementary proficiency
Recommendations received
1 person has recommended Gautier
More activity by Gautier
-
I read a lot about what Europe should focus on in AI recently, and thought I would throw my 2 cents in the ring (so to speak :) ). The open-source…
Liked by Gautier Berthou
-
When we forked MySQL Cluster at Hopsworks to build RonDB, we wanted to make the world's fastest database easier to configure and install - a managed…
Liked by Gautier Berthou
-
ML Project Idea 💡 Let's predict flight delays ✈️↓ 𝗧𝗵𝗲 𝗽𝗿𝗼𝗯𝗹𝗲𝗺 Let's build a Machine Learning system to predict flight delays at…
Liked by Gautier Berthou
-
"Every millisecond represents millions," is something we often hear from customers and partners. Real-time AI capabilities empower the most…
Liked by Gautier Berthou
-
We had an amazing time at PyData London this weekend! 🎉 We connected with a fantastic community of data enthusiasts and had engaging conversations…
Liked by Gautier Berthou
-
Rockset just got acquired by Open AI. Rockset provides a data platform for real-time data queries - important for RAG: "Usage at scale: With global…
Liked by Gautier Berthou
-
How to architect AI systems is not a very sexy topic, but is crucial to the success of AI projects. Modularity is a key mechanism for decomposing a…
Liked by Gautier Berthou
-
Planning for your first model. Designing your first model. Getting your first model in production. Running your 5th model in production. How a…
Liked by Gautier Berthou