About
I am an accomplished technology executive with deep expertise in engineering…
Articles by Chu-Cheng
Activity
-
Welcome Manju Rajashekhar! ⭐ Our newest EIR is a seasoned technology leader with a passion for entrepreneurship and building high-performing…
Liked by Chu-Cheng Hsieh
-
You've no doubt heard of WeWork and know about its valuation plunge from $47B to $8B in 2019, caused by its aggressive growth strategy which had it…
Liked by Chu-Cheng Hsieh
-
The SHEIN team had a productive week at the 2025 World Economic Forum Annual Meeting in Davos, where we had insightful discussions with global…
Liked by Chu-Cheng Hsieh
Experience
Education
Publications
-
Automatic Speaker Recognition with Limited Data
the 13th ACM International WSDM Conference (WSDM’20)
Automatic speaker recognition (ASR) is a stepping-stone technology towards semantic multimedia understanding and benefits versatile downstream applications. In recent years, neural network-based ASR methods have demonstrated remarkable power to achieve excellent recognition performance with sufficient training data. However, it is impractical to collect sufficient training data for every user, especially for fresh users. Therefore, a large portion of users usually has a very limited number of training instances. As a consequence, the lack of training data prevents ASR systems from accurately learning users' acoustic biometrics, jeopardizes the downstream applications, and eventually impairs user experience.
In this work, we propose an adversarial few-shot learning-based speaker identification framework (AFEASI) to develop robust speaker identification models with only a limited number of training instances. We first employ metric learning-based few-shot learning to learn speaker acoustic representations, where the limited instances are comprehensively utilized to improve the identification performance. In addition, adversarial learning is applied to further enhance the generalization and robustness for speaker identification with adversarial examples. Experiments conducted on a publicly available large-scale dataset demonstrate that AFEASI significantly outperforms eleven baseline methods. An in-depth analysis further indicates both the effectiveness and the robustness of the proposed method.
-
Isa: Intuit Smart Agent, A Neural-Based Agent-Assist Chatbot (to appear)
IEEE International Conference on Data Mining (ICDM'18)
Hiring seasonal workers in call centers to provide customer service is a common practice in B2C companies. The quality of service delivered by both contracting and employee customer service agents depends heavily on the domain knowledge available to them. When observing the internal group messaging channels used by agents, we found that similar questions are often asked repeatedly by different agents, especially less experienced ones. The goal of our work is to leverage the promising advances in conversational AI to provide a chatbot-like mechanism for assisting agents in promptly resolving a customer’s issue. In this paper, we develop a neural-based conversational solution that employs a BiLSTM with an attention mechanism and demonstrate how our system boosts the effectiveness of customer support agents. In addition, we discuss the design principles and the necessary considerations for our system. We then demonstrate how our system, named Isa (Intuit Smart Agent), can help customer service agents provide a high-quality customer experience by reducing customer wait time and by applying the knowledge accumulated from customer interactions in future applications.
-
Monetary Discount Strategies for Real-Time Promotion Campaign
The 26th World Wide Web conference (WWW' 17)
The effectiveness of monetary promotions has been well reported in the literature to affect shopping decisions for utilitarian products in real life experience [3]. Nowadays, e-commerce retailers are facing fiercer competition on price promotion in that online consumers can easily hunt for the best products with the highest value at a reasonable price. We study e-commerce data, namely shopping receipts collected from email accounts, and conclude that for utilitarian products like books or electronics, buyers are price sensitive and are willing to delay the purchase for better deals. We then present a real-time promotion framework, called the RTP framework: a one-time promoted discount price is offered to entice a potential buyer to make a decision promptly. To make real-time promotion more effective in pursuit of better profits, we propose two discount-giving strategies: one algorithm based on kernel density estimation and another based on Thompson sampling. We show that, given a pre-determined discount budget, our algorithms earn significantly better revenue than classical strategies that simply apply a fixed discount to the label price, which suggests the framework is a promising candidate for deployment in e-commerce services for real-time promotion.
-
Efficient Approximate Thompson Sampling for Search Query Recommendation
The 30th ACM/SIGAPP Symposium On Applied Computing (SAC'15)
Query suggestions have been a valuable feature for e-commerce sites in helping shoppers refine their search intent. In this paper, we develop an algorithm that helps e-commerce sites like eBay combine the output of different recommendation algorithms. Our algorithm is based on “Thompson Sampling”, a technique designed for solving multi-armed bandit problems where the best results are not known in advance but instead are tried out to gather feedback. Our approach is to treat query suggestions as a competition among data resources: we have many query suggestion candidates competing for limited space on the search results page. An “arm” is played when a query suggestion candidate is chosen for display, and our goal is to maximize the expected reward (user clicks on a suggestion). Our experiments have shown promising results in using click-based user feedback to enhance the quality of query suggestions.
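The arm-and-reward framing in this abstract can be illustrated with a minimal sketch of textbook Beta-Bernoulli Thompson sampling. This is not the paper's efficient approximate variant; the function names, the uniform Beta(1, 1) prior, and the simulated click-through rates are all assumptions made for illustration.

```python
import random


def thompson_pick(stats, rng):
    """Pick the candidate whose sampled click-through rate is highest.

    stats maps candidate -> [clicks, skips]. Each candidate is assumed to
    have a Beta(1 + clicks, 1 + skips) posterior (uniform prior).
    """
    return max(
        stats,
        key=lambda c: rng.betavariate(1 + stats[c][0], 1 + stats[c][1]),
    )


def simulate(true_ctr, rounds, seed=0):
    """Simulate showing one suggestion per round and observing a click or skip.

    true_ctr is a hypothetical map of candidate -> real click probability,
    unknown to the algorithm. Returns how often each candidate was shown.
    """
    rng = random.Random(seed)
    stats = {c: [0, 0] for c in true_ctr}
    shown = {c: 0 for c in true_ctr}
    for _ in range(rounds):
        candidate = thompson_pick(stats, rng)   # "play an arm"
        shown[candidate] += 1
        clicked = rng.random() < true_ctr[candidate]
        stats[candidate][0 if clicked else 1] += 1  # update the posterior
    return shown


if __name__ == "__main__":
    counts = simulate({"red shoes": 0.05, "running shoes": 0.5, "shoe rack": 0.1}, 2000)
    print(counts)
```

Because each round samples from the posterior rather than taking the current best estimate, weak candidates still get shown occasionally while their estimates are uncertain, and display share shifts toward the candidate with the highest click rate as feedback accumulates.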
-
A short-term bookmarking system for collecting user-interest data
PAKDD Workshop on Big Data Science and Engineering on E-Commerce
During the shopping process, users typically narrow down their search to a small collection of products before making a final purchase. These data, consisting of products that users are considering purchasing, correlate strongly with user search intent and product desirability. By allowing users to bookmark products between browsing and purchasing, we collect user-interest information. We then propose a product recommendation algorithm based on these data. By considering both popular and long-tail queries, we shed light on the potential usage of the data.
-
Incorporating Popularity in Topic Models for Social Network Analysis
the 36th Annual ACM Special Interest Group on Information Retrieval (SIGIR)
In this paper, we propose topic models to deal with social network data. Our topic models are specialized in dealing with the "popularity bias" caused by the dominance of a limited number of popular users (or nodes) in a dataset. In conventional topic models, such popular nodes have simply been removed because they do not carry much meaning (e.g., "the" and "is"). However, in a social network dataset, most people are interested in popular users (e.g., Barack Obama and Britney Spears), so these users should be carefully handled.
To solve this problem, we introduce a notion of "popularity component" and explore various ways to effectively incorporate it. Through extensive experiments, we show that our proposed models achieve significant improvements over the existing models in terms of lowering "perplexity". We also show that the outgoing edge degree (how many people a user follows) does not help much in achieving lower perplexity. Our models can be useful in providing more accurate recommendations and clusterings for various services including social network services.
-
Finding similar items by leveraging social tag clouds
ACM Symposium on Applied Computing (SAC)
Recently, social collaboration projects such as Wikipedia and Flickr have been gaining popularity, and more and more social tag information is being accumulated. In this study, we demonstrate how to effectively use social tags created by humans to find similar items. We create a query-by-example interface in which users find similar items by offering examples as a query. Our work aims to measure the similarity between a query, expressed as a group of items, and another item by using the tag information. We show that using human-generated tags to find similar items poses at least two major challenges: popularity bias and the missing tag effect. We propose several approaches to overcome these challenges. We build a prototype website that allows users to search over all entries in Wikipedia based on tag information, and then collect 600 valid questionnaires from 69 students to create a benchmark for evaluating our algorithms based on user satisfaction. Our results show that the presented techniques are promising and surpass the leading commercial product, Google Sets, in terms of user satisfaction.
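To make the query-by-example idea concrete, here is a minimal sketch of one simple way to score items against a group of example items using tags. It is not the paper's method: the inverse-document-frequency weighting (used here as a rough stand-in for dampening popular tags), the cosine score, and every function name and data value are assumptions for illustration.

```python
from collections import Counter
from math import log, sqrt


def idf_weights(item_tags):
    """Weight each tag by how rare it is, so very popular tags count less."""
    n = len(item_tags)
    df = Counter(t for tags in item_tags.values() for t in set(tags))
    return {t: log(n / d) + 1.0 for t, d in df.items()}


def tag_vector(items, item_tags, idf):
    """Sum the weighted tags of a group of items into one sparse vector."""
    vec = Counter()
    for item in items:
        for t in set(item_tags[item]):
            vec[t] += idf[t]
    return vec


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = sqrt(sum(w * w for w in u.values())) * sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def rank_similar(query_items, item_tags):
    """Rank every non-query item by tag similarity to the query group."""
    idf = idf_weights(item_tags)
    query = tag_vector(query_items, item_tags, idf)
    scores = {
        item: cosine(query, tag_vector([item], item_tags, idf))
        for item in item_tags
        if item not in query_items
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical toy tag data.
    tags = {
        "Paris": ["city", "capital", "france"],
        "Berlin": ["city", "capital", "germany"],
        "Munich": ["city", "germany"],
        "Banana": ["fruit"],
    }
    print(rank_similar(["Paris", "Berlin"], tags))
```

A sketch like this does nothing about the missing tag effect named in the abstract: an item that lacks a relevant tag simply scores lower.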
-
Detecting Unknown Malicious Executables Using Portable Executable Headers
Fifth International Joint Conference on INC, IMS and IDC
Even though numerous kinds of anti-virus software packages have been used for many years, previously unseen malware is still a serious threat to computer and information systems. By analyzing the portable executable header entries of executables, this study builds a malware detection model that consists of four stages: attribute extraction, attribute binarization, attribute elimination, and combined feature selection and classifier training. First, we collected header entries from all executables in our dataset and viewed each entry as a potential attribute. Second, information gain and gain ratio were used to binarize numerical and nominal attributes. Next, useless and redundant attributes were eliminated in the third stage. Finally, using a support vector machine, a classification algorithm with strong generalization ability, feature selection was performed simultaneously with classifier training to reduce the number of attributes while retaining classifier performance in a cost-effective manner. We evaluated our model on 1,908 benign programs and 7,863 malicious files (viruses, email worms, trojans, and backdoors) and estimated its generalization ability by cross validation. The experimental results showed that our model had promising performance for detecting viruses and email worms.
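The binarization stage mentioned above can be sketched as choosing, for a numeric attribute, the split threshold with the highest information gain. This is a minimal illustration under assumptions, not the paper's implementation: it covers information gain only (not gain ratio), and the function names and toy values are hypothetical.

```python
from math import log2


def entropy(labels):
    """Binary entropy of a list of 0/1 labels (1 = malicious, 0 = benign)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting a numeric attribute at threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    remainder = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(labels) - remainder


def best_threshold(values, labels):
    """Return the cut point with the highest information gain.

    Assumes the attribute takes at least two distinct values; the largest
    value is skipped because splitting there leaves one side empty.
    """
    candidates = sorted(set(values))[:-1]
    return max(candidates, key=lambda t: information_gain(values, labels, t))


if __name__ == "__main__":
    # Hypothetical header-field values and labels.
    section_counts = [1, 2, 3, 10, 11, 12]
    is_malicious = [0, 0, 0, 1, 1, 1]
    t = best_threshold(section_counts, is_malicious)
    print(t, information_gain(section_counts, is_malicious, t))
```

With the chosen threshold, the numeric attribute becomes a single binary feature (value above the threshold or not), which is the form the later elimination and classifier-training stages would consume.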
-
A Virus Prevention Model Based on Static Analysis and Data Mining Methods
the IEEE 8th International Conference on Computer and Information Technology
Because traditional anti-virus methods lack the ability to prevent infection, this study proposes a behavior-based virus prevention model for detecting unknown viruses. We first defined the behaviors of an executable by observing its usage of dynamically linked libraries and Application Programming Interfaces. Then, information gain and support vector machines were applied to filter out the redundant behavior attributes and select informative features for training a virus classifier. The performance of our model was evaluated on a dataset containing 1,758 benign executables and 846 viruses. The experimental results are promising: the overall accuracies are 99% and 96.66% for detecting known viruses and previously unseen viruses, respectively.
-
Experts vs The Crowd: Examining Popular News Prediction Performance on Twitter
-
In the finance domain, the famous Efficient Market Hypothesis (EMH) concludes that crowd wisdom is superior to any expert wisdom in picking financial stocks. In this study, we test a similar hypothesis in the domain of news recommendation by conducting experiments on Twitter. We first identify a group of experts on Twitter who have consistently identified "interesting" (or popular) news early on and have recommended it in their tweets. We then collect two sets of news: a set of incoming news recommended by these experts and a similar set recommended by the "crowd". We then observe, for a few months, how widely the news in the two sets is circulated on Twitter, and evaluate which set contains more widely circulated news (and is therefore more likely to be interesting). After conducting repeated experiments, we draw a conclusion similar to the EMH: the crowd wisdom is always the winner in our experiments; we could not identify an expert group whose news recommendation performance was consistently better than that of the crowd. We then proceed to investigate whether the expert wisdom can be used to improve crowd wisdom in any way.
Courses
-
Artificial Intelligence
CS161
-
Automated Reasoning: Theory and Applications
CS264A
-
Connectionist Language Processing
CS263B
-
Data and Knowledge Base System
CS240A
-
Database
CS143
-
Distributed Database System
CS244A
-
Intelligent Information System
CS245A
-
Web Applications
CS144
-
Web Information System
CS246
Projects
-
Finding similar items by leveraging social tag clouds
-
I created a query-by-example interface for finding similar items by offering examples as a query. My work aims to measure the similarity between a query, expressed as a group of items, and another item by using the tag information.
Honors & Awards
-
Best Paper Runner Up
SIGIR 2013
Paper (with Youngchul Cha, Bin Bi, and Junghoo “John” Cho), “Incorporating popularity in topic models for social network analysis”, in Proceedings of the 36th Annual ACM Special Interest Group on Information Retrieval (SIGIR), May, 2013.
-
Conference Travel Award
SAC'12
Paper (with Junghoo “John” Cho), “Finding similar items by leveraging social tag clouds”, in Proceedings of the Annual ACM Symposium on Applied Computing (SAC), March 26-30, 2012. Student Travel Award.
Languages
-
English
-
-
Chinese
-
Recommendations received
27 people have recommended Chu-Cheng
More activity by Chu-Cheng
-
As software development grew in the 70s & 80s, it adopted management styles from manufacturing with little change. This gave rise to the Waterfall…
Liked by Chu-Cheng Hsieh
-
One of the best videos you’ll ever see about Large Language Models. Probably not the best fancy slides or visuals and the sound quality is not…
Liked by Chu-Cheng Hsieh
-
AI Revolution: Shaping the Future of Work I had an incredible experience delivering the opening keynote at the North America Emerging Technology…
Liked by Chu-Cheng Hsieh