Thursday, April 5, 2012

Article Review - Mixed Reality Taxonomies

Howdy!


Today, I looked at a research paper from 1994 entitled A Taxonomy of Mixed Reality Visual Displays. This research was one of the earliest attempts to classify the (at the time) newly developing field of augmented reality. The beginning pages of this (very long) paper focus on a continuum of "virtuality", in which the researchers distinguish between a reality that is simply augmented with virtual objects (augmented reality) and a virtual world that is supplemented with elements of the real world (augmented virtuality), with the two extremes being the purely physical world and a purely virtual world, as outlined in the image below.


 (image from the paper)

The paper then goes on at some length about the variations of augmented reality and augmented virtuality, comparing graphics-based content and video-based content, discussing technical limitations in reproduction fidelity, and covering a few other topics. The most interesting takeaway from all of that discussion is just seeing how far the field has come in eighteen years of research and development. One topic that is still relevant to today's AR research is the idea of EWK, or Extent of World Knowledge. Some systems need to know exactly what the world they are working in looks like, so that virtual content can interact correctly with real objects, while others (such as Group Graffiti!) have no need for that information. The continuum for EWK, as defined by the researchers, is shown below.

(image from the paper)


I feel like this paper, while not really breaking new ground in terms of research and development, was an essential first step for the field of augmented reality. It's also a fitting end to my blog posts about this topic, as it looks back on two action-packed decades in a comforting and nostalgic light. The end.


Thanks and gig 'em!

Milgram, Paul; Kishino, Fumio. 1994. "A Taxonomy of Mixed Reality Visual Displays". IEICE Transactions on Information and Systems, vol. E77-D, no. 12, pp. 1321-1329.
URL: http://www.eecs.ucf.edu/~cwingrav/teaching/ids6713_sprg2010/assets/Milgram_IEICE_1994.pdf

Tuesday, March 27, 2012

Article Review - A Novel Method for QR Security

Howdy!

Today, I looked at a research paper from 2010 entitled A Novel Secret Sharing Technique Using QR Code. The researchers appreciated the value of widespread bar-code usage in every industry from retail to manufacturing because of how easy it is to encode data, scan and decode it, and produce the codes themselves. The aspect they wanted to improve was security: as it currently stands, any device with a bar-code scanner (i.e. most commercial phones nowadays) can decode a standard bar-code within seconds. Their response was a framework built around multiple QR codes with a built-in security threshold, with all the data needed by their algorithm stored in the codes themselves (removing the need for an internet connection). The framework can be seen in the image below.

(image from the paper)

Essentially, the algorithm boils down to a group of n QR codes that all contain related data, with at least t of them required to recover the key, a0. Once the key is obtained, any single QR code can then be decoded. According to the research team, the big benefit of this method is that no remote database connection is needed; all the data required is in the codes themselves.
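
Just to make the threshold idea concrete, here's a quick sketch of a classic Shamir-style (t, n) secret sharing scheme in Python. The paper's actual construction differs in its details, and the prime, share format, and function names below are my own illustration rather than the authors' code.

import random

PRIME = 2**61 - 1  # a large prime defining the finite field for the shares

def make_shares(secret_a0, t, n):
    # Build a random polynomial f(x) of degree t-1 with f(0) = secret_a0,
    # then hand out the points (x, f(x)); each point could be carried
    # inside one QR code.
    coeffs = [secret_a0] + [random.randrange(PRIME) for _ in range(t - 1)]
    def f(x):
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
    return [(x, f(x)) for x in range(1, n + 1)]

def recover_key(shares):
    # Lagrange interpolation at x = 0 recovers a0, provided at least t
    # distinct shares are supplied.
    a0 = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        a0 = (a0 + yi * num * pow(den, -1, PRIME)) % PRIME
    return a0

shares = make_shares(secret_a0=123456789, t=3, n=5)
print(recover_key(shares[:3]))  # any 3 of the 5 shares -> 123456789

Each share could ride along as the payload of one QR code: scan any t of them and the key a0 falls out, scan fewer and the key stays hidden.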


This article seemed interesting to me largely because we are focusing on QR codes for context-awareness and have been thinking about possibly storing other meta-data in the codes as well. Currently, we're not considering any sensitive data for this task, but it's nice to know that we could if we wanted to.
 
Thanks and gig 'em!

Chuang, Jun-Chou; Hu, Yu-Chen; Ko, Hsien-Ju. 2010. "A Novel Secret Sharing Technique Using QR Code". International Journal of Image Processing. CSC Journals, Kuala Lumpur, Malaysia. pp. 468-475.
URL: http://www.cscjournals.org/csc/manuscriptinfo.php?ManuscriptCode=69.70.69.76.41.46.50.47.102

Monday, March 26, 2012

Article Review - Personalized Mobile Sightseeing Tours

Howdy!


Today, I looked at a research paper from 2010 entitled Personalized Sightseeing Tours Support Using Mobile Devices. These researchers described their plans to migrate a sightseeing desktop application to Android devices for users on the go. It is similar to some other articles I've read on these types of applications in that they want a context-aware system, some augmented reality for ease of use, and recommendations drawn from the user's profile settings, location, and other filters like collaborative ratings. The basic architecture they've planned is shown below.



(image from the paper)


The unique thing about this article is the researchers' description of their system's occasionally-connected behavior. They recognized the limitations of mobile solutions: the current lack of 100% internet availability, the increased battery use of an active internet connection, and the lower (on average) speed of mobile internet. Based on the user's context variables, particularly location, the system will grab a chunk of relevant data and store it locally on the phone using SQLite. Through periodic polling, the system will grab new data as it is needed, which should not be too often. Using this method, the user can change their sightseeing preferences and get results in real time, even without an active internet connection.
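
The occasionally-connected pattern is easy to picture in code. Here's a rough sketch using Python's standard sqlite3 module instead of the Android stack the authors are targeting; the table layout, the five-kilometre radius, and the fetch_nearby_pois() stub are all assumptions of mine, not details from the paper.

import sqlite3, time

db = sqlite3.connect("poi_cache.db")
db.execute("""CREATE TABLE IF NOT EXISTS poi (
                  id INTEGER PRIMARY KEY, name TEXT, category TEXT,
                  lat REAL, lon REAL, rating REAL, fetched_at REAL)""")

def fetch_nearby_pois(lat, lon, radius_km):
    # Stand-in for the remote call; in a real client this would return
    # parsed JSON from the tour server whenever a connection is available.
    return []

def refresh_cache(lat, lon, radius_km=5.0):
    # Periodic poll: grab a chunk of relevant data near the user's
    # location and store it locally for offline use.
    rows = [(p["id"], p["name"], p["category"], p["lat"], p["lon"],
             p["rating"], time.time())
            for p in fetch_nearby_pois(lat, lon, radius_km)]
    db.executemany("INSERT OR REPLACE INTO poi VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    db.commit()

def local_recommendations(category, min_rating):
    # Preference changes are answered straight from the cache,
    # so no active connection is needed.
    return db.execute("SELECT name FROM poi WHERE category = ? AND rating >= ?",
                      (category, min_rating)).fetchall()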


For our application, we will probably need to store some content locally, deleting it as necessary. The approach taken by these researchers is a good stepping stone toward that.


Thanks and gig 'em!

Anacleto, Ricardo; Figueiredo, Lino; Luz, Nuno. 2010. "Personalized Sightseeing Tours Support Using Mobile Devices". Human-Computer Interaction - IFIP Advances in Information and Communication Technology. Springer Boston, pp. 301-304.
doi: 10.1007/978-3-642-15231-3_35
URL: http://dx.doi.org/10.1007/978-3-642-15231-3_35



Wednesday, March 21, 2012

Article Review - Large-Scale Smartphone User Studies

Howdy!

Today, I looked at a research paper from 2010 entitled The Challenges in Large-Scale Smartphone User Studies. The author recognized the pervasive presence of the smartphone over the past decade and wanted to look at the usage habits of Blackberry users. The application was a simple logger (pictured below) that would maintain data based on two hardware cues: the LCD screen being activated and the OS idle flag being set.

  (image from the paper)

Two main statistics were taken, with the results shown in the graph below. The total interaction time per day averaged almost 1.7 hours, and the average number of sessions per day was around 87. Both values were skewed by high outliers and (according to the author) also affected by malicious users and resource-hogging apps.

(image from the paper)
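
To make those two statistics concrete, here's a small sketch (mine, not the paper's code) of how screen-on and idle events from such a logger could be folded into per-day session counts and total interaction time. The event names and log format are assumed.

from collections import defaultdict
from datetime import datetime

def daily_usage(events):
    # events: time-ordered list of (unix_timestamp, kind) pairs, where
    # kind is "screen_on" (session starts) or "idle" (session ends).
    sessions = defaultdict(int)    # day -> number of sessions
    seconds = defaultdict(float)   # day -> total interaction time in seconds
    start = None
    for ts, kind in events:
        if kind == "screen_on" and start is None:
            start = ts
        elif kind == "idle" and start is not None:
            day = datetime.fromtimestamp(start).date()
            sessions[day] += 1
            seconds[day] += ts - start
            start = None
    return sessions, seconds

Incidentally, 1.7 hours spread over roughly 87 sessions works out to only about 70 seconds per session on average.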

Thanks and gig 'em!

Earl Oliver. 2010. The challenges in large-scale smartphone user studies. In Proceedings of the 2nd ACM International Workshop on Hot Topics in Planet-scale Measurement (HotPlanet '10). ACM, New York, NY, USA, Article 5, 5 pages.
doi: 10.1145/1834616.1834623
URL: http://doi.acm.org/10.1145/1834616.1834623

Article Review - Sentient Graffiti

Howdy!

Today, I looked at a research paper from 2007 entitled A context-aware mobile mash-up platform for ubiquitous web. As the title indicates, these researchers sought to create a mobile application platform that supported context-aware features in what is known as the Ubiquitous Web. Context-awareness was achieved by a combination of features, including precise location (GPS), proximity (Bluetooth), an object's physical features, and user-generated annotations (i.e. keywords/phrases). The Ubiquitous Web (or UW) is an idea that builds on this context-awareness: physical objects are linked to virtual resources via URIs, which then provide information and services that enrich the user experience.

The specific platform they envisioned and implemented was known as Sentient Graffiti (or SG). The flow of information in this platform is described in the image below. Essentially, the functional blocks of SG divide into two layers: graffiti annotation involves the placement of content using the context-aware properties mentioned previously, and graffiti discovery/consumption is the browsing of those pieces of content.

 (image from the paper)
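
As a rough sketch of those two layers, the snippet below (my own illustration, not SG's actual API) models an annotation as a URI tied to a location and some keywords, and then filters the pool by the viewer's position and interests; the 50-metre radius and field names are invented.

from dataclasses import dataclass, field
from math import radians, sin, cos, asin, sqrt

@dataclass
class Annotation:
    uri: str                      # virtual resource linked to a physical spot
    lat: float
    lon: float
    keywords: set = field(default_factory=set)

def distance_m(lat1, lon1, lat2, lon2):
    # Haversine distance in metres between two lat/lon points.
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371000 * asin(sqrt(a))

def discover(pool, user_lat, user_lon, interests, radius_m=50):
    # Discovery/consumption layer: keep annotations near the user whose
    # keywords overlap the user's interests.
    return [a for a in pool
            if distance_m(user_lat, user_lon, a.lat, a.lon) <= radius_m
            and (not interests or a.keywords & interests)]

pool = [Annotation("http://example.org/graffiti/1", 30.6150, -96.3410, {"food"})]
print(discover(pool, 30.6151, -96.3411, {"food"}))   # -> the one nearby annotation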


The researchers insisted that SG is a platform and not an application because it can be used to create other context-aware applications. They claim that SG addresses three critical functional requirements for these types of apps: the modeling of every physical object/region, the filtering of annotations by context, and the pool of HTTP push/pull commands for user interaction. The paper closed with a few examples of how SG could be used, one of which is pictured below.

(image from the paper) 

This paper is exciting for several reasons. First off, we're finding more and more interest in the idea of virtual graffiti, which is great. On the flip side, there are still many improvements to be made on all of these previous attempts. Additionally, this research has shown that consumer-grade mobile devices are very capable of running context-aware server/client applications.

Thanks and gig 'em!

Lopez-de-Ipiña, D.; Vazquez, J.I.; Abaitua, J.; "A context-aware mobile mash-up platform for ubiquitous web," 3rd IET International Conference on Intelligent Environments (IE 07), pp. 116-123, 24-25 Sept. 2007.
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4449920&isnumber=4449897

Thursday, March 8, 2012

Article Review - Augmented Reality 2.0

Howdy!

Today, I looked at a research paper from 2011 entitled Augmented Reality 2.0. The team behind this paper took a fresh look at AR by comparing it to another popular buzz word - Web 2.0. As they described it, the move to Web 2.0 involved a shift toward massive, simultaneously connected, and user-driven content. In addition, there was a paradigm shift to mask the differences between different sets of data - that is, local data should be indistinguishable from roaming data, and different content sources should be able to be mashed together into new, dynamically generated content. In the researchers' opinions, this should be the goal of augmented reality as well (see figure).

(image from the paper)

The researchers then went into the components that they thought were most important when developing AR content-generators. The first, application data, would essentially be a transference of the way Web 2.0 handles its data - through the AJAX model of asynchronous JavaScript and XML. The next component, real-world representation, is being solved right now through a variety of methods, including planar surfaces, geo-location data, and even pure camera-based data. The third (and biggest) challenge is the actual development of client-side AR applications. The team identified several authoring tools that could be used, including some for non-programmers (see figure). It's worth noting that the biggest challenge here is still the development of mobile AR applications by non-programmers, but the researchers believed that the wide array of easy-to-understand Python scripts could be combined to make it work.

(image from the paper)
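
Just to ground the "application data" component, here's a tiny sketch of the mash-up idea: several content sources are fetched concurrently and merged into one feed, much like an AJAX front end would do. The endpoint URLs and response layout are made up for illustration.

import json
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

# Hypothetical content sources for a given location.
SOURCES = [
    "http://example.org/ar/pois?lat=30.61&lon=-96.34",
    "http://example.org/ar/annotations?lat=30.61&lon=-96.34",
]

def fetch(url):
    # Each source is assumed to answer with a JSON list of content items.
    with urlopen(url, timeout=5) as resp:
        return json.load(resp)

def mashup(sources=SOURCES):
    # Fetch all sources concurrently and merge the results into one feed,
    # tagging each item with where it came from.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(fetch, sources))
    feed = []
    for url, items in zip(sources, results):
        for item in items:
            item["source"] = url
            feed.append(item)
    return feed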

After describing a few case studies that were beginning to make use of these components, the paper concluded with some brief miscellaneous thoughts. First, they recognized a need for more research in the area of outdoor localization, believing GPS to be too inaccurate to fully realize AR 2.0 and recommending computer vision techniques now that mobile phones have become much more powerful. Next, they looked at potential development areas, seeing that potential in just about every area, from personal exploration to urban planning to education. Finally, they briefly discussed the need for better user evaluation techniques, stating that current methods are too small and too qualitative.

Overall, I feel like this article gave me more confidence in the importance of what my project group is working on. Group Graffiti seeks to help answer many of the concerns of "AR 1.0", and with applications like ours continually being developed, AR 2.0 could be well on its way.

Thanks and gig 'em!

Billinghurst, Mark; Langlotz, Tobias; Schmalstieg, Dieter; "Augmented Reality 2.0," Virtual Realities, pp.13-37. 2011.
doi: 10.1007/978-3-211-99178-7_2
URL: http://dx.doi.org/10.1007/978-3-211-99178-7_2

Monday, February 20, 2012

Article Review - User Studies for Touch-Screen Interfaces

Howdy!

Today, I looked at a research paper from 2009 entitled Evaluation of User Interface Design and Input Methods for Applications on Mobile Touch Screen Devices. In a deviation from the augmented reality studies, this paper performed a user study to try to determine the "best" interface design for mobile touch-screen devices. The authors commented that previous studies had given somewhat vague guidelines on what a good interface "should" be, but they wanted to compare very specific UI techniques and gather empirical data.

(image from the paper)

To do this, they looked at three different aspects of mobile application interfaces. First, they compared a scrollable layout to a tabbed layout by having users navigate to different pages. An interesting finding here was that even though the scrollable layout produced faster results, a qualitative questionnaire revealed that users seemed to prefer the aesthetics of the tabbed layout. Next, they had users edit their profiles with modal dialogs (e.g. a date picker) versus non-modal input methods (directly entering text with the keyboard). For this task, the clear winner in both timing and preference was the non-modal form of entry, probably because users were more accustomed to the standard QWERTY keyboard; this might yield different results with today's users. Finally, they had users navigate a list and use a menu to perform various operations on each element, once with a context menu (long press) and once with the physical menu button. In this case, the physical button won out in both timing and usability simply because the users felt the long press was not very intuitive. Even by today's menu standards, I tend to agree with these results.

The biggest drawback of this usability study was the small experimental group, although it did help that the researchers had a variety of ages and experience levels within that small group. Usability is something often overlooked, so the more research efforts like these, the better.

Thanks and gig 'em!

Balagtas-Fernandez, Florence; Forrai, Jenny; Hussmann, Heinrich; "Evaluation of User Interface Design and Input Methods for Applications on Mobile Touch Screen Devices," Human-Computer Interaction – INTERACT 2009. Lecture Notes in Computer Science, pp.243-246.
doi: 10.1007/978-3-642-03655-2_30
URL: http://dx.doi.org/10.1007/978-3-642-03655-2_30

Thursday, February 9, 2012

Article Review - "Massively Multi-user AR"

Howdy!

Today, I looked at a research paper from 2005 entitled Towards Massively Multi-user Augmented Reality on Handheld Devices. The Austrian team behind it was one of the first groups to try to get away from the cumbersome augmented reality setups that the field began with, including the backpacks, wearable desktop computers, goggles, etc. They identified three devices that are generally more accessible and socially fitting: PDAs, phones, and tablets. At the time of their research, 3D acceleration and high processing power were not as readily available as they are today, and powerful phones and tablets were scarce and extremely expensive, so they chose the PDA with an attachable camera add-on as the platform for their research.

(system architecture, image from the paper)

The architecture they chose is displayed above. The only two components they had to build from scratch were PocketKnife, a hardware abstraction layer designed to be platform-independent and ease the development of graphical applications on mobile devices, and KLIMT, a renderer similar to OpenGL (which was not available for mobile devices at the time of the research). Every other system component was open-source, including the tracking software (OpenTracker), scene-graph renderer (Studierstube), and distributed networking framework (ACE); the team just put the pieces together in the right way.

(Invisible Train game, image from paper)

To actually test and evaluate this system, they built a game-like application (pictured above) that was designed for easy use by the general public, including children. Essentially, it was a multi-player train simulator where participants could choose to play collaboratively (trying to avoid hitting each other's trains) or competitively (trying to crash into each other). They launched the game in four locations simultaneously and supported several thousand users over multiple days, a test group an order of magnitude larger than any previously used in AR research (according to the team). They gathered fairly informal evaluative information from the participants in the form of questionnaires and interviews, and on the whole the response was positive. They felt they had achieved a landmark goal in the quest for "AR anytime, anywhere".

Speaking from a present-day view, this research effort does seem to be a big step in the right direction, as there have been AR applications built since then on the concept of a massive user-base. It is also a positive takeaway for our own project that they managed to build this application using so many existing open-source components.

Wagner, Daniel; Pintaric, Thomas; Ledermann, Florian; Schmalstieg, Dieter; "Towards Massively Multi-user Augmented Reality on Handheld Devices," Pervasive Computing, 2005. Lecture Notes in Computer Science, pp.77-95.
doi: 10.1007/11428572_13
URL: http://dx.doi.org/10.1007/11428572_13

Monday, February 6, 2012

Article Review - "Cultural Heritage Layers"

Howdy!

Today, I'll be looking at an article from 2009 that discusses an augmented reality application coined "Cultural Heritage Layers". The research team behind this system took issue with existing cultural heritage AR projects due to their lack of marker-less tracking, their doubtful scientific accuracy, and the fact that they tend to be built on proprietary (and therefore unsustainable) software. To tackle the first issue, the team built a simple but effective two-step process that the system goes through to learn locations from reference images.

(figure taken from article)

First, randomized trees are constructed based on detectable "key points" of a video frame; these patches are then tracked based on their alignment to the reference image (the tracking is done via a local search algorithm). This method seems pretty cool to me, although I wish the authors had gone into more detail concerning the feature detection. They did claim that this method made testing and verification extremely easy, since you could basically just use cardboard cutouts to simulate large buildings.
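
I can't reproduce the randomized-tree classifier here, but the general keypoint-and-match idea is easy to sketch with OpenCV's ORB features standing in for the paper's method; the file names are placeholders and the matching strategy is a generic substitute, not the authors' algorithm.

import cv2
import numpy as np

# Detect keypoints in a reference image and a camera frame, match them,
# and estimate the alignment between the two. ORB plus brute-force matching
# is used here as a generic stand-in for the paper's randomized trees.
reference = cv2.imread("reference_building.jpg", cv2.IMREAD_GRAYSCALE)
frame = cv2.imread("camera_frame.jpg", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=500)
kp_ref, des_ref = orb.detectAndCompute(reference, None)
kp_frame, des_frame = orb.detectAndCompute(frame, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_frame, des_ref), key=lambda m: m.distance)
good = matches[:50]   # keep the strongest correspondences

# A homography from these correspondences aligns the live frame to the
# reference image, which is what lets a flat historic texture be pinned
# onto the correct spot in the camera view.
src = np.float32([kp_frame[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp_ref[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)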

As far as actually setting up the scene, they relied on a fairly lightweight framework called X3D that just needed minimal OpenGL and 3D acceleration - mostly because they only used 2D textures of historic buildings and scenes, as opposed to actual 3D models. An X3D scene is basically made up of the camera image in the background, a fixed filling quad to display the footage, and a flat ImageTexture node. The idea is that those who wish to set up their own scenes with various options for interactivity can just throw them on top of this node with an extremely small amount of code (the authors claim that only one line of X3D code is needed per point of interest). To demonstrate the system, they briefly described two in-use applications. The first, a visual look at the history of the Berlin Wall, allows users to cycle through historical images by touching the virtual scene on their mobile display. The other, at an Italian heritage site called the Reggia di Venaria Reale, actually renders the "real" scene differently depending on the image being displayed. For example, a particularly old black-and-white photo of Italian architecture will draw all people, environments, and real buildings in black and white so that the scene looks seamlessly enhanced with the virtual building.

(figure taken from article)

The work being done here could apply to our system in a few different ways, including the idea of interactivity, 2D image overlays, and tracking. What I really like about this Cultural Heritage Layers project is that it didn't introduce some dazzling new AR algorithm that involved laborious computer vision techniques, but instead focused on existing technologies, an interesting and profitable domain, and usability. Those are the features that AR work needs to improve upon if it is ever to be adopted on a mass scale.

Thanks and gig 'em!

Zoellner, M.; Keil, J.; Drevensek, T.; Wuest, H.; "Cultural Heritage Layers: Integrating Historic Media in Augmented Reality," 15th International Conference on Virtual Systems and Multimedia (VSMM '09), pp. 193-196, 9-12 Sept. 2009.
doi: 10.1109/VSMM.2009.35
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5306012&isnumber=5305995

Thursday, February 2, 2012

Augmented Reality article review


Howdy!

Today, I'll look at a SIGGRAPH article from 2005 that discusses augmented reality (AR) interfaces by breaking them down into abstract components. The authors begin by stating that AR has not yet made significant progress beyond the first stage of new interface development - the prototype demonstrations. They claim that while there exist very interesting and intuitive systems for viewing three-dimensional information using AR, there is still very little support for actually creating or modifying content. This claim could probably be argued against in recent years with applications like AR graffiti being developed, but as a field, I still think there is a ton of work to be done in AR content modification and creation. The article does do a great job of breaking down AR into three distinct components: the physical elements, the display elements, and the interaction metaphor that links them. The first two parts are probably the easiest to encapsulate and define, and as such, the interaction metaphors are still fairly limited.

The authors attempt to deal with this limitation by sharing what they call "tangible user interfaces," and they go into the details of a couple of case studies of their own work. The first one, AR MagicLens, allows users to view inside virtual datasets. The physical elements for this system are a simple paper map for the virtual coordinates and a small ring-mouse that the user can hold. The defined interaction metaphor is the ability of the user to hold a virtual magnifying glass and peer inside virtual sets of data. An example they gave was a virtual model of a house that, when viewed with the virtual magnifier, exposed its frame (also virtually). The second case study, mixed reality art, basically allowed the user to combine real painting tools with virtual paint, bridge surfaces, and polygonal models to create virtual works of art on real surfaces.

Both of these case studies are pretty cool, and it would make sense that they probably served as a basis for other AR applications in the past few years. For our capstone project, I could see the "mixed reality art" research being quite helpful.

That's it for today. Thanks and gig 'em!

article info:
Billinghurst, Mark; Grasset, Raphael; Looser, Julian; "Designing augmented reality interfaces," ACM SIGGRAPH Computer Graphics - Learning through computer-generated visualization, Volume 39, Issue 1, February 2005.

Wednesday, January 18, 2012

Howdy!

Not much here yet (including a picture), but that will be fixed soon. Stay tuned!