Welcome to the Databases and Information Systems Group
 
  

Enterprise Application Search

Dynamic web applications make up a large part of the web, but most search engines do not index them correctly. Enterprise search engines face the same problem: they do not correctly interpret JSP or PHP pages. On the client side, Macromedia (Adobe) Flash websites, JavaScript-based applications, and, more recently, AJAX applications (Google Maps, Google Scholar) enable a new wave of web applications. We want to index both server-side (JSP) and client-side (AJAX) applications.

Theses:

Available Thesis 1. Indexing JSP applications

Indexing the server side of applications is a problem that current search engines, including enterprise search engines, leave unsolved. An enterprise search engine must look at the raw data (JSP pages, relational databases, XML), index it, and return results in terms of the user view. Current search engines do not do this because they do not understand how the application assembles the data to generate the user view.

Goals:

1) Study the patterns in JSP applications. Consider Java Beans and Web Application Frameworks (e.g., Struts).
2) Automate indexing of JSP pages by transforming them into an intermediate XML format.
3) Plug in search functionality by extending the taglibs with "search" hooks (which allow the index to be built on the fly).
4) Consider workflows and privacy issues in JSP data and encode them in the index.
This thesis has implications for web software engineering.
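
To make goal 2 more concrete, the following is a minimal sketch of what a transformation of a JSP page into an intermediate XML format could look like. It is not the group's implementation: the function name `jsp_to_xml`, the element names (`page`, `directive`, `expression`, `scriptlet`, `taglib`, `static`) and the regular expression are assumptions made for illustration, and a real JSP parser would need to handle comments, nested tags, and closing tags as well.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical token pattern (an assumption, not a full JSP grammar):
# directives <%@ ... %>, expressions <%= ... %>, scriptlets <% ... %>,
# and opening or self-closing custom tags such as <c:out value="..."/>.
JSP_TOKEN = re.compile(
    r"<%@(?P<directive>.*?)%>"
    r"|<%=(?P<expr>.*?)%>"
    r"|<%(?P<scriptlet>.*?)%>"
    r"|<(?P<tag>[A-Za-z]\w*:\w+)(?P<attrs>[^>]*?)/?>",
    re.DOTALL,
)


def jsp_to_xml(source: str) -> ET.Element:
    """Split a JSP source string into static and dynamic parts as XML."""
    root = ET.Element("page")
    pos = 0
    for m in JSP_TOKEN.finditer(source):
        # Everything between two dynamic tokens is static template text.
        static = source[pos:m.start()].strip()
        if static:
            ET.SubElement(root, "static").text = static
        if m.group("directive") is not None:
            ET.SubElement(root, "directive").text = m.group("directive").strip()
        elif m.group("expr") is not None:
            ET.SubElement(root, "expression").text = m.group("expr").strip()
        elif m.group("scriptlet") is not None:
            ET.SubElement(root, "scriptlet").text = m.group("scriptlet").strip()
        else:
            # Custom tag from a tag library; keep its name and raw attributes.
            ET.SubElement(
                root,
                "taglib",
                name=m.group("tag"),
                attrs=m.group("attrs").strip(),
            )
        pos = m.end()
    tail = source[pos:].strip()
    if tail:
        ET.SubElement(root, "static").text = tail
    return root


# Hypothetical sample page, used only to exercise the sketch.
sample = (
    '<%@ page language="java" %>'
    "<h1>Orders</h1><p>Customer: <%= customer.getName() %></p>"
    '<c:out value="${order.total}"/>'
)
root = jsp_to_xml(sample)
print(ET.tostring(root, encoding="unicode"))
```

In such a representation, the `static` elements could be indexed directly, while the `expression` and `taglib` elements mark the places where the application inserts data at run time, which is where "search" hooks as in goal 3 might attach.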

Thesis openings, January 2007.

Available Thesis 2. Indexing AJAX Applications

Client-side dynamic pages pose major challenges for existing search engines, which almost entirely ignore this type of application. The reason is that more and more applications are one-page applications: all logic resides on the client side, and search engines do not understand the application logic. This thesis attempts to find a solution to this issue.

Goals:

1) Study the patterns of AJAX applications, Flash applications, and RIA (Rich Internet Applications): server-based parts, client-based parts, the structure of the web page (DOM), static and dynamic parts. Decompose a one-page site into several parts.
2) Extract all relevant information (text, data, and unparsable parts) into an XML format.
3) Adapt our existing predicate-based indexing framework [3] to indexing the client-side web in a given application scenario.
4) Consider data structures (indexes) that are flexible enough to accommodate new dimensions in the index on demand. For example, a dynamic JavaScript application might create several types of DOM trees for the same page, so the index must be able to cover all of them efficiently.
5) Dynamically create and maintain indexes for web applications by monitoring user behaviour.
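
As an illustration of goal 4, the sketch below shows one possible way to index several DOM trees of the same page: each posting records not only the URL but also a label for the DOM state in which a term was visible. The class names `TextExtractor` and `StateIndex`, the state labels, and the example URL are hypothetical, and the sketch assumes the DOM snapshots have already been captured as HTML strings by some other means. It is a plain inverted index, not the predicate-based framework of [3].

```python
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible words from an HTML snapshot, skipping script/style."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only text outside <script>/<style> counts as user-visible content.
        if self._skip_depth == 0:
            self.words.extend(data.lower().split())


class StateIndex:
    """Inverted index whose postings are (url, state) pairs."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add_snapshot(self, url, state, html):
        parser = TextExtractor()
        parser.feed(html)
        parser.close()
        for word in parser.words:
            self.postings[word].add((url, state))

    def search(self, word):
        return sorted(self.postings.get(word.lower(), set()))


# Two hypothetical DOM states of the same one-page application.
index = StateIndex()
url = "http://example.org/app"
index.add_snapshot(url, "initial", "<div id='list'><p>Zurich hotels</p></div>")
index.add_snapshot(
    url,
    "after-click",
    "<div id='detail'><p>Zurich restaurants</p>"
    "<script>var x = 'hidden';</script></div>",
)
print(index.search("zurich"))
print(index.search("hotels"))
```

Because the state label is part of each posting, adding a new kind of DOM tree for an existing page only adds postings and does not require rebuilding the index, which is one simple reading of "new dimensions on demand".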

References

[1] Cristian Duda, David Graf, Donald Kossmann. Predicate-based indexing of Enterprise Web Applications. Demo Paper, CIDR 2007.
[2] Jens Dittrich, Cristian Duda, Björn Jarisch, Donald Kossmann, Marcos Vaz Salles. Bringing Precision to Desktop Search: A Predicate-based Desktop Search Architecture. Poster Paper, ICDE 2007.
[3] Jens Dittrich, Cristian Duda, Björn Jarisch, Donald Kossmann, Marcos Vaz Salles. Keyword Search in Application Data. Technical Report, ETH Zurich, 2007.
[4] Patterns in AJAX Web Applications.


© 2012 ETH Zurich | Imprint | Disclaimer | 13 March 2007