Ph.D. Candidate
Data Lab
Department of Computer Science
Email: jinli-at-cs.pdx.edu (replace "-at-" by @ to send e-mail)
About me
I am a Ph.D. student in the Computer Science Department at Portland State
University. My advisor is David Maier. During the summer of 2007 I was also an
intern at AT&T Research.
My research focuses on high-volume data stream
processing. My work includes modeling
data streams, efficient implementations of stream query operators and a new
architecture for scalable, high-performance stream processing systems. More
broadly, I am interested in designing and building data-intensive systems.
I expect to graduate in 2008, and I am currently looking for a position at a
research lab or in advanced product development. Here is my CV.
Education
Ph.D. Computer Science,
Advisor: David Maier
Dissertation: Windowed Queries over Data Streams
M.S. in Computer Science and Engineering,
B.S. in Computer Science and Engineering, Xi’an Institute of Technology,
Research
My primary area of research
interest is data stream processing. The following are the projects I have
worked on:
Research Assistant, 09/2002 – present
NiagaraST: NiagaraST is a stream query engine that extends the Niagara Internet Query System. In NiagaraST, the progress of inter-operator streams is indicated by punctuation. Stream query operators are punctuation-aware and do not have to rely on physical stream properties, such as arrival order, arrival delay, or time skew among stream sources, to make progress. Using punctuation to communicate stream progress explicitly improves the efficiency, flexibility, and scalability of stream processing.
Latte: Latte extends NiagaraST to support integrated stream-archive processing. The project is motivated by our observation that advanced stream processing often requires integrating or comparing streaming data with relevant archived historical data. It aims to retrieve “related” archive data in real time.
AT&T Shannon Research Lab,
Summer Intern, 07/2007 – 09/2007
Enforcing stream order is a common way for stream systems to communicate the progress of inter-operator streams. However, in high-volume stream processing, such as network traffic monitoring, enforcing stream order can incur significant performance overhead, especially in throughput and memory. Using punctuation to communicate stream progress explicitly can eliminate the need to enforce order, and with it the associated overhead. I extended Gigascope, an operational network packet monitoring system, to support punctuation for inter-operator stream progress. The experimental results show significant performance improvements, especially in memory and throughput.
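The idea can be illustrated with a small sketch. This is not NiagaraST or Gigascope code; the names Event, Punctuation, and windowed_count are hypothetical, and the sketch assumes that a punctuation with timestamp p asserts that no later tuple will carry a timestamp below p. Under that assumption, a tumbling-window count can accept tuples in any order and emit a window only once a punctuation shows that the window is complete.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Tuple, Union


@dataclass(frozen=True)
class Event:
    ts: int  # application timestamp of the tuple


@dataclass(frozen=True)
class Punctuation:
    ts: int  # assumed meaning: no later Event has a timestamp below ts


def windowed_count(
    stream: Iterable[Union[Event, Punctuation]], width: int
) -> Iterator[Tuple[int, int]]:
    """Count events per tumbling window of the given width.

    Events may arrive out of order. A window [start, start + width) is
    emitted as (start, count) only when a punctuation covers its end.
    """
    counts: Dict[int, int] = defaultdict(int)
    for item in stream:
        if isinstance(item, Punctuation):
            # Every window that ends at or before the punctuation is complete.
            closed = sorted(w for w in counts if (w + 1) * width <= item.ts)
            for w in closed:
                yield (w * width, counts.pop(w))
        else:
            # No sorting or buffering of events: just update the window state.
            counts[item.ts // width] += 1


if __name__ == "__main__":
    stream = [
        Event(3), Event(12), Event(1), Punctuation(10),
        Event(15), Event(11), Punctuation(20),
    ]
    for start, count in windowed_count(stream, width=10):
        print(start, count)
```

In this sketch the operator keeps only one counter per open window, rather than a sort buffer over the input, which is one way to see where memory savings could come from when order is not enforced.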
Industry experience
National Instruments
Software Engineer, 04/2000 – 08/2002
Selected publications
Jin Li, Kristin Tufte, Vladislav Shkapenyuk, Vassilis Papadimos, Theodore Johnson, David Maier. Out-of-Order Processing: A New Architecture for High-Performance Stream Systems. To appear in VLDB 2008.
Kristin Tufte, Jin Li, David Maier, Vassilis Papadimos, Robert Bertini. Travel Time Estimation Using NiagaraST and Latte. In the 26th ACM SIGMOD International Conference on Management of Data (Demo), June, 2007.
Jin Li, David Maier, Kristin Tufte, Vassilis Papadimos, and Peter A. Tucker. Semantics and Evaluation Techniques for Window Aggregates in Data Streams. In Proc. of the 24th ACM International Conference on Management of Data (SIGMOD), June 2005.
Jin Li, David Maier, Kristin Tufte, Vassilis Papadimos, and Peter A. Tucker. No Pane, No Gain: Efficient Evaluation of Sliding-Window Aggregates over Data Streams. In SIGMOD Record, 34(1), March 2005.
David Maier, Jin Li, Peter Tucker, Kristin Tufte, and Vassilis Papadimos. Semantics of Data Streams and Operators. In Proc. of the 10th International Conference on Database Theory (ICDT), January 2005.
“The good life is one inspired by love and guided by knowledge.” – Bertrand Russell