Activity
-
I only recently switched to an iPhone and discovered Apple’s default auto text replacement of ‘Omw’ to “On my way!”. What’s amusing is that this is…
Liked by Hongkai Wu
-
We are hiring! StockX is the next generation of e-commerce for the next generation of consumer. StockX’s…
Posted by Hongkai Wu
-
I am a refugee. My parents sacrificed an established life in Vietnam to allow my sister and I to pursue an American education. They experienced…
Liked by Hongkai Wu
Experience
-
Pilot Studio Inc
Education
Publications
-
Minerva II: A Novel Entity Discovery Tool
ACM SIGCHI
Entity discovery is a long-standing interest of governments, enterprises, and the research community. It is a complex task that requires retrieving, extracting, linking, and displaying entities. Algorithms to support entity discovery have been proposed across disciplines including Information Retrieval (IR), Information Extraction (IE), Natural Language Processing (NLP), and Data Mining (DM). However, there has been little study of User Interface (UI) design for supporting effective entity discovery. This paper presents Minerva II, a novel entity discovery tool, to tackle this challenge. In the paper, we illustrate the UI design and how it effectively supports the typical workflow when a user performs entity discovery. We also describe a new visualization algorithm for entity networks. Our user study shows that Minerva II is able to greatly increase users' efficiency.
-
Modeling Search Engine's Explorations in Dynamic Search: An Ontological Perspective.
M.S. Thesis. Georgetown University. DC, USA.
Dynamic search is an information retrieval task in which information systems retrieve documents for a user’s multiple queries. Each query starts a search iteration and aims to fulfill part of the user’s information need. Modeling a search engine’s explorations in dynamic search serves to help search engines explore the information space, retrieve relevant documents, and fulfill the user’s information need. Previous work employs topic modeling, such as Latent Dirichlet Allocation (LDA), to fulfill the user’s information need. In each iteration, that approach discovers potential topics of the user’s information need and diversifies the search result by retrieving documents covering these topics. This thesis proposes to structure the user’s information need as an ontology (a topic hierarchy for knowledge representation) and to use topic transitions on the ontology to model a search engine’s explorations in dynamic search. The ontology presents a clear landscape for the search engine’s explorations and improves the effectiveness and efficiency of the user’s information seeking. The ontology can be obtained from external resources, such as Wikipedia, or built on top of topic construction algorithms, such as the nomothetic concept hierarchy construction method. In this thesis, we presume the ontology is presented to the search engine and focus on how the search engine efficiently achieves topic transitions on the ontology. Drawing an analogy between a search engine’s explorations on an ontology and a robot’s explorations in a world, we model the search engine’s explorations in dynamic search as a Reinforcement Learning (RL) problem and aim to learn a policy that optimizes the topic transitions. We apply Multi-Armed Bandit (MAB) and Partially Observable Markov Decision Process (POMDP) models to learn the search engine’s policy. We evaluate the model using the most recent Text REtrieval Conference Dynamic Domain track (TREC DD 2015) datasets. The results show that our model is highly effective.
Courses
-
Advanced Algorithms
COSC-540
-
Advanced Database
COSC-580
-
Computer Hardware & System Architecture
COSC-520
-
Information Retrieval
COSC-488
-
Introduction to Data Analytics
ANLY-501
-
Machine Learning
COSC-575
-
Research Tutorial
COSC-901
-
Thesis
COSC-999
-
Topics in Computer Security
COSC-730
-
Web Search and Sense Making
COSC-589
Projects
-
Anicademy
- Cofounded Anicademy, a 3D AI character platform for manga and fiction creators, focusing on advanced tools like 3D modeling and voice creation.
- Led the technology strategy and development. Designed the memory system, integrated Text-to-Speech and Speech-to-Text models and used Elasticsearch as the vector database.
- Achieved an end-to-end communication time of under 1.5 seconds, outperforming 99% of competitors.
-
StockX - Notification Platform
-
- Led the Notification Platform project at StockX, guiding a team of 4 engineers and a project manager.
- Designed and implemented a microservice and event-driven architecture, improving the platform's scalability.
- Collaborated effectively with over 10 teams, initiating the migration of 20+ notification types and planning for an additional 100+, targeting the capacity to manage 10 million notifications daily.
- Streamlined development, cutting time for new notification integrations by 66%, from one month to 1.5 weeks.
-
StockX - Product Feeds
-
- Headed the Product Feeds project, leading 2 engineers and a product manager to develop real-time product feeds.
- Achieved seamless integration of StockX's product pricing data with Google Shopping for instant updates, and implemented 1-day delayed pricing feeds to social media platforms like Facebook and Snapchat.
- Played a pivotal role in driving 38% of total website traffic to StockX using these product feeds.
-
Google Cloud - Cloud SQL Backend Support
-
- Spearheaded the design and execution of a Cloud SQL Blackout Window for maintenance activities, ensuring minimal disruption to services.
- Designed and supported minor version migrations for Cloud SQL databases. On average, a Cloud SQL instance experiences less than 60 seconds of downtime during migration.
-
Pure Storage - Time-Series Database (TSDB): CaerusDB
-
CaerusDB is part of the Pure1 Monetization project and lets paid-tier customers retain metrics data for 3 years. It consists of Kairos serving online requests, a Cassandra cluster for hot data, Kafka plus workers to batch and aggregate data, and S3 as cold storage. CaerusDB handles traffic of 3.6 trillion data points per hour and supports storing up to 10 PB of data.
- Designed the database schema and the mechanism to split hot storage (Cassandra) and cold storage (S3).
- Designed and implemented the metrics pipeline, using Kafka, to batch metrics received from REST calls and save them in cold storage, and to aggregate metrics based on the requirements in each REST call.
-
Pure Storage - Active Management
-
Active Management helps customers manage their on-premises devices from the cloud. It has 4 components: a workflow engine, a Kafka cluster as the message channel, a security component to ensure authorization and authentication, and an agent on the on-premises device to execute tasks.
- Designed the active management flow from the cloud to on-premises devices.
- Designed and implemented a workflow service on top of AWS Step Functions to let customers perform active management of on-premises devices, for example restoring a snapshot from NFS to FlashArray.
-
Pure Storage - Snapshot Catalog
-
Snapshot Catalog aims to provide customers with a global view of their snapshots to help them manage the data protection module.
- Designed and built the backend supporting the snapshot catalog in the cloud, accounting for device local volume snapshots, protection group local/remote snapshots, to-NFS snapshots, and non-Pure snapshots.
-
Pure Storage - Single-Sign-On (SSO)
-
Integrated the Pure1 login flow with Okta and Auth0 to provide single sign-on, so that customers can integrate it with their Active Directory Certificate Services and get finer-grained access control.
-
Pure Storage - Backend Support for FlashBlade
-
Honors & Awards
-
Computer Science Master Student Scholarship
Department of Computer Science, Georgetown University
Languages
-
English
Full professional proficiency
-
Chinese
Native or bilingual proficiency
More activity by Hongkai
-
Remote access leaves companies vulnerable to security threats such as phishing, application attacks, and malware. Stay ahead of the attackers: learn…
Liked by Hongkai Wu