[Category] School Notice     [Author] Administrator     [Published] 2019-01-04 09:46:57




Title: Linguistic Information: Concepts, Structures, and Quantitative Assessments

-- A Linguistic and Cognitive Approach to Natural Language Processing and Machine Learning

Abstract: In this talk, I will discuss a linguistic and cognitive approach to certain fundamental issues in natural language processing (NLP) and machine learning with textual data. Specifically, in contrast to the currently prevailing methods, which rely mainly on statistical approaches or mathematical operations that treat data as symbols, I will address the informational nature of textual content as data that carries meaning and information.

    NLP, as a subfield of Artificial Intelligence, has a long history with many attempts at significant progress. However, the fundamental issues in the field have proven much more complicated than originally expected. Over the years, after hand-written heuristics and rule-based systems failed to scale to general applicability, mainstream NLP approaches gradually settled on statistical methods for text classification and for identifying information in “unstructured data”. Even so, progress has been limited, because understanding natural language remains an extremely challenging task.

    In this talk, I will present the major parts of a theoretical framework, together with implementation methods, that I have proposed for representing the relationships between language, knowledge, and information. The topics covered will include the concepts, structures, and quantitative measurements of what I call “linguistic information”. I will compare this framework with modern information theory, which is based on Claude Shannon’s ground-breaking work, and extend Shannon’s basic concepts to natural language data and the process of linguistic communication.

    Demos will also be given to show that more intelligent technologies and products can be built by using linguistic and cognitive approaches in addition to statistical ones.

Speaker Bio: Dr. Ronald (Guangsheng) Zhang completed his first graduate degree at Shanghai Jiao Tong University, majoring in Applied Linguistics and Foreign Languages for Science and Technology, and was then retained as an assistant professor at the same university. After moving to the United States, he earned his PhD in Linguistics from the University of Delaware. During this period, he developed deep insights into how natural languages encode and carry information and how the human brain comprehends language, together with a keen interest in building natural language-based technologies and products, both to solve practical problems and to verify theoretical hypotheses.

    In recent years, Dr. Zhang has worked as the CTO and Chief Scientist at Linfo Research in Silicon Valley, California, USA, and as the Chief Scientist at Bello Intelligent Technologies Co. in Shenzhen, China. Dr. Zhang is the inventor of 32 issued patents covering new technologies that include topic modeling, intelligent search engines, information extraction from unstructured data, rule-based methods for high-accuracy sentiment analysis, unsupervised machine-learning methods for knowledge discovery and knowledge representation, and information management methods and user interface functionalities for large amounts of unstructured data. Some of these patents are also being turned into academic papers.

Author: School of Computer Science and Technology   Reviewed by: Liu Xuejun (劉學軍)