Infrared and Vision Frames
Freenect Camera Demo

Rhetorical Situation


Purpose

Conceived primarily as a project around the concepts of computer vision and its associated issues, the initial plan was to pair research into this subject area with work toward finding or implementing a software bridge between the C/C++ libraries of either OpenCV or Freenect and the Node.js JavaScript framework. This was a natural extension of my previous work with computer vision and Node.js-supported projects, and I hoped it would be an opportunity to explore writing, compiling, and linking software libraries on non-Windows systems.
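
To give a sense of what such a bridge would need to wrap, the sketch below grabs a single video frame through libfreenect's synchronous C wrapper; this is the kind of low-level call a Node.js binding would then expose to JavaScript. It is a minimal illustration under assumed conditions (libfreenect and its sync extension installed, one Kinect attached), not code the project actually produced.

    /* Hedged sketch: grab one RGB frame via libfreenect's sync wrapper.
       Assumes libfreenect and its sync extension are built and installed
       (e.g., from the OpenKinect sources). */
    #include <stdio.h>
    #include <stdint.h>
    #include "libfreenect_sync.h"

    int main(void)
    {
        void *frame = NULL;
        uint32_t timestamp = 0;

        /* Ask device index 0 for a single 640x480 RGB video frame. */
        if (freenect_sync_get_video(&frame, &timestamp, 0, FREENECT_VIDEO_RGB) != 0) {
            fprintf(stderr, "No Kinect found or frame grab failed\n");
            return 1;
        }

        printf("Got an RGB frame at timestamp %u\n", timestamp);
        return 0;
    }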

Closely linked to this first cluster of goals was a secondary purpose: creating a game-like project driven by the Kinect device, its inputs, and software designed around presenting users with an avatar of themselves. By tying into the current wave of gestural-input devices for gaming consoles like the Xbox 360, Xbox One, and PlayStation 3 and 4, the hope was to gain an appreciation of the technologies, challenges, and software involved in this process. In dealing with lower-level software like the Freenect library, I expected to see firsthand what was involved in using devices like the Kinect and in trying to process its signals toward some ludic end.

Should the project develop enough to allow active testing, the final purpose was to expose other academics to the field of computer vision. In first designing a plan for the project, the initial idea was to reach a point where software developed for the Kinect could be shown off, or at least demonstrated remotely to the distance students who were part of the class. And although this was achieved to some small degree by the end of the timeline established for the project, the plan had been to move beyond a few minutes of example code in order to showcase how gesture-based interactions could be deployed outside a gaming setting and perhaps even in a classroom environment.

Audience

Because of its default usage with the Xbox 360 and Xbox One systems, the primary audience was targeted generally as those not already exposed to the technology through gaming consoles or other similar peripheral systems. However, as the project was also developed as part of a class, the groups represented in such a setting were positioned as the control audience expected to see the demonstration. This would include those who might be classified as having limited weekly game-playing experience and loosely defined as self-identifying female within the age range of 23 to 45 years old.

As a secondary audience, individuals within academia and within the age range of 22 to 55 were also seen as part of the intersecting group that might see the demonstration while lacking previous experience with computer vision as presented through current and previous generations of gaming consoles. Although potentially a very large collection of people, the project was initially to become part of the portfolio requirement for graduation in the Master of Arts program and thus could be seen by anyone wanting to examine my CV or a listing of projects I had worked on as part of the program.

For a tertiary audience, the expectation was that I would write about the project on my personal blog, and thus anyone subscribed to my RSS feed would see and potentially read about the project. However, as the project and research progressed and the process slowed tremendously, this was set aside as something to be done, if at all, outside the established timeline. It does remain an active, if prolonged, goal to pursue once some reportable progress has been made.

Subject

As written about previously, the topic of research concerned the issues and challenges in using the Kinect device to A) show users themselves as represented through a computer’s vision and computation of their located selves within the camera’s field of vision, and B) expose users inexperienced with similar devices to how gesture-based technologies could be used in the future. More generally, this might be classified as computer vision in practice, although the project was far narrower in scope than that whole field of research.

In fact, much of the research ended up being in and around the issues surrounding the licensing of software libraries and the conflict between two very different views of software access: that expressed through the OpenKinect project with its freely available Freenect library, and that expressed by Microsoft through the terms of service and end-user licensing agreements attached to its operating systems, .NET framework, and Kinect SDK. While one, OpenKinect, sought to allow users to control their device, the other was designed around controlling users themselves and curtailing what was and was not allowed to happen in connection to their device.

Context

Much as the project was planned around a primary audience drawn from the class in which it was created and worked on, the setting for the project was of a similar nature. It was to be presented either within the classroom itself (as it was, to some degree) or within the Media Park on the Norfolk campus of Old Dominion University. For both, the challenge was in trying to find and maintain both the correct running environment for the code and the distance needed between the device and those it gazes upon for the duration of the demonstration.

To be able to perform the demonstration, visual access to the output of the software’s computation and direct input from the Kinect are needed. Because individuals must be present and within visual or auditory range of the device, it requires a specific physicality that makes it hard to demonstrate remotely. To use the Kinect, as with most other gesture-based technologies, the user must be physically facing the device. Although the output can be sent to a terminal with access to the information feeds, there always needs to be some presence in front of the device for it to work as intended.

Design

Most of what was reviewed or learned about design during the span of the project lay in the clash between software philosophies and how such paradigms affect end-users during their connection to them. While not always thought of as designed in the traditional sense, both the Windows and Mac OS X operating systems convey a profound sense of being planned and executed around design principles. From the lowest level of kernel code up through the user interface, each level interlocks with another to build a powerfully structured system of layers and ways of thinking about how a user should interact with software.

Central to this observation was the journey of initially using the Freenect library and the perils around the way it works. By downloading source code and compiling it on the machine, I was able to load USB drivers and use low-level code without much in the way of software gatekeepers on a Mac OS X system. Through command-line access, I could change some fundamental aspects of the operating system with only a few words and some patience. As long as I stayed in this environment and did not venture out too much (Xcode, the official way to compile code, is much more confining) I could get much done.
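
For the record, getting Freenect running this way followed the OpenKinect project’s usual CMake routine, roughly the steps below; the repository URL and options are from memory of the period and may have shifted since.

    git clone https://github.com/OpenKinect/libfreenect.git
    cd libfreenect
    mkdir build && cd build
    cmake ..
    make
    sudo make install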

This contrasts with Microsoft’s approach to compiling code, which requires either accepting licensing terms to use its Visual Studio product suite or installing open-source tools with reduced functionality to do the same thing. Everything downloaded and installed through official channels, and for official access to the Kinect device, included multiple agreements with various terms-of-service contracts.

Both companies presented different trust cultures and thus underlying design ideas. While Mac OS X would allow nearly any change via the command-line interface, it would often hide system settings behind obscured or even obfuscated methods, showcasing extreme trust in its power users but hiding things from those without the same level of knowledge. Windows, on the other hand, showed a lack of trust in anyone but administrators, yet allowed them free rein over most changes as long as they were willing to agree to the terms offered and did not step outside the boxes provided by Microsoft.

Project

An honest assessment of the progress of the project would feature one word prominently: frustration. Most of the early time on the project was spent tracking down resources to use first the Freenect (and later OpenCV) libraries and then trying to build a bridge between them and the Node.js framework. I found multiple projects and individuals that had achieved what I wanted to various degrees of success in the past, but I was unable to find anything current that had not already been discontinued due to changes in the API of one of the libraries or simply because it would no longer compile. (There are several Node.js modules out there, for example; however, none currently work.)

Along with the word frustration would be the phrase “lack of time” in some large font, followed by an angry exclamation point. By the time I had decided to indulge Microsoft in its (stupid) games to get the Kinect working, I had already used up most of the time on the project and was forced to jump not only operating systems, but software libraries and even programming languages. This set the project back so far that I was unable to do much more than record that I hated Microsoft and that it was a stupid-head for making me jump through its hoops, reaching the point of merely running example code at the very tail end of the project’s timeline.

Theory

In first conceiving the project, there was hope of using Hayles’s continuum of pattern-randomness from her book How We Became Posthuman to locate the role that materiality plays in having a presence for the Kinect to compute. By not only having a body but also being part of a computer’s vision, the theme of presence and absence (first brought up in New Media: The Key Concepts as part of our class) was planned to play a prominent role in locating the theory behind the project. By comparing a person’s infrared signature with the “normal” vision presented by the Kinect’s other camera, the device is able to ‘see’ and locate people by their relative distance and warmth.

To be ‘seen’ by the Kinect, then, becomes a ratio of person to background, a calculation that reduces a person to a percentage: how much of the image the person occupies. Such reductive computations track with Gane and Beer’s presentation (in New Media: The Key Concepts) of ideas from Manovich in how, in locating works in a digital sense, we should always recognize the numerical representation of the medium (echoed again with greater emphasis in his Software Takes Command). All data on a computer is nothing but numbers, and all perceptions from a computer’s vision are nothing but algorithms measuring those numbers against some outside stimulus.
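
That ratio is simple enough to compute directly. As a hedged sketch of the idea (not the project’s code), the snippet below reads one depth frame through libfreenect’s sync wrapper and reports what fraction of the frame falls within an arbitrary “near” cutoff; the 11-bit depth format and the cutoff of 700 are assumptions for illustration, not calibrated values.

    /* Hedged sketch: the "person as a percentage of the image" idea,
       computed by thresholding a Kinect v1 depth frame. */
    #include <stdio.h>
    #include <stdint.h>
    #include "libfreenect_sync.h"

    int main(void)
    {
        void *frame = NULL;
        uint32_t timestamp = 0;

        /* Kinect v1 depth frames are 640x480; FREENECT_DEPTH_11BIT packs
           raw 11-bit readings into 16-bit integers. */
        if (freenect_sync_get_depth(&frame, &timestamp, 0, FREENECT_DEPTH_11BIT) != 0) {
            fprintf(stderr, "Could not read a depth frame\n");
            return 1;
        }

        const uint16_t *depth = (const uint16_t *)frame;
        const int total = 640 * 480;
        int near_pixels = 0;

        /* Treat any valid reading closer than a guessed raw cutoff
           (700 here, an assumption) as belonging to the person. */
        for (int i = 0; i < total; i++)
            if (depth[i] > 0 && depth[i] < 700)
                near_pixels++;

        printf("Person occupies %.1f%% of the frame\n",
               100.0 * (double)near_pixels / total);
        return 0;
    }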

If we are looking at the output of the Kinect, we are seeing not only ourselves, but ourselves as presented by the machine. In other words, and to locate Lingua Fracta in this, we are looking at ourselves as seen through the algorithms applied to the visual input. In many ways, we thus become the subjects of our own gaze, trying both to configure an appealing image of our computed selves and to conform to the machine’s commands. We have to match how the machine expects the input to be, standing neither too close nor too far, and within its field of view, as we watch ourselves watch ourselves through the Kinect’s “eyes”.

Learning

Technologies:

  • Mac OS X 10.10 (laptop)
  • Windows 7 (desktop) and 8.1 (laptop)
  • Freenect library
  • Various Node.js modules tested
  • Kinect SDK
  • Kinect (version 1 device)

Coming off the experiences of jumping between operating systems (and their paradigms), I cannot write that I am eager to hop into another project that offers the same challenges. While I found nothing inherently wrong with using Mac OS X, I spent much of the project learning the operating system at the same time, having not used it for several years. In many ways, then, I was learning not only how to play nice with the Kinect, but also the problems of moving between Windows and Mac OS X systems (among them not only file formats, but file systems, too).

Along those same lines, I would also like to revisit using the Kinect at some later point. Despite all the problems I personally had, I really do think it is possible to bridge the Freenect and Node.js systems (after all, people have done it in the past), but I was unable to find a working solution in the time allotted to the project. While I do not know if I could create what I initially planned, I do have some hope left that it should be possible to make a game out of using the Kinect and other software, be that through the JavaScript I wished for or via some Microsoft-blessed C# code.
