Document classification is a classic problem in both machine learning and information retrieval. One application of document classification is automatic email routing: given an email (a document), the system attempts to predict the destination to which the email should be routed. In theory, an automatic system could replace a person who sorts emails by hand, saving time and money. However, incorrectly sorted emails would then need to be re-sorted manually, so it is important for the system to be accurate. The Engineering IT department at the University of Illinois at Urbana-Champaign runs a helpdesk that users can email with technical problems. The department serves the entire College of Engineering, which encompasses many departments, so it needs a way to sort emails according to which IT professional is best suited to fix a user’s problem. This problem is therefore well suited to a document classification solution. However, the helpdesk differs from similar problems in one respect: the IT department already employs an email routing technique that is quite accurate. It is clear that a stand-alone document classification solution would be subpar, but combining it with the existing routing method might provide higher accuracy. In this thesis, we explore ways to combine classic document classification techniques with the existing routing strategy used by the helpdesk. We test different text-based features, but we find that because the existing method is already very accurate, it is quite difficult to improve on.
A study of automatic email routing for an information technology help desk