Supporting astronauts with augmented reality.
| Table of Contents |
| --- |
| Week 9 Snippet |
| Week 8 Snippet |
| Week 7 Snippet |
| Week 6 Snippet |
| Week 5 Snippet |
| Week 4 Snippet |
| Week 3 Snippet |
| Week 2 Snippet |
| Week 1 Snippet |
Hoo boy. You saw last week’s Next Steps. You know this week is our last week in class, with demo day looming on March 14th, and even though our team will continue working on our project until the NASA SUITS challenge concludes in a few months, we want to finish off the goals we set toward the beginning of the quarter. We put our heads down, cracked our knuckles, brought some caffeine, bought some 2 AM pizza, and got to work.
In the name of performance, Karman and Muskan set off to try hosting our model on Firebase. It didn’t pan out, as they couldn’t work out the kinks.
Soon after, Adrian had the brilliant idea to host on attu, UW’s student-accessible remote Linux machines, and to our excitement, it showed promise! The HoloLens could send and receive byte payloads to a Python script hosted on attu using C#’s and Python’s sockets APIs. However, getting it fully working turned out to be a massive struggle: bytes were lost in transit and data formats were mismatched. Once we swapped from UDP to TCP, though, our main blocker fell away, since our transmissions were finally reliable.
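For the curious, here’s a rough sketch of the kind of length-prefixed TCP exchange we landed on. It’s an illustration under our own assumptions (the class and helper names are made up), not our actual scripts:

```csharp
using System;
using System.Net.Sockets;

// Hypothetical sketch of a length-prefixed TCP exchange between the
// HoloLens client and the Python server on attu. Names are placeholders.
public static class AttuClient
{
    public static byte[] SendImage(string host, int port, byte[] imageBytes)
    {
        using (var client = new TcpClient(host, port))
        using (NetworkStream stream = client.GetStream())
        {
            // Prefix the payload with its length so the server knows exactly
            // how many bytes to read; TCP is a byte stream, not a message
            // protocol. (The Python side must agree on a 4-byte little-endian prefix.)
            byte[] lengthPrefix = BitConverter.GetBytes(imageBytes.Length);
            stream.Write(lengthPrefix, 0, lengthPrefix.Length);
            stream.Write(imageBytes, 0, imageBytes.Length);

            // Read the response the same way: 4-byte length, then payload.
            byte[] header = ReadExactly(stream, 4);
            int responseLength = BitConverter.ToInt32(header, 0);
            return ReadExactly(stream, responseLength);
        }
    }

    // Keep reading until the requested number of bytes has arrived.
    private static byte[] ReadExactly(NetworkStream stream, int count)
    {
        var buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0) throw new Exception("Connection closed early.");
            offset += read;
        }
        return buffer;
    }
}
```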
Our image-processing pipeline with HoloLensCameraStream looks like this:

1. Retrieve the next available camera sample from HoloLensCameraStream.
2. Apply image segmentation to it on attu.
3. Apply color reduction for distinct boundaries between objects.
4. Identify the bounding boxes of rocks in the color-reduced images.
5. Project the pixel coordinates of the rocks’ centers into world space and place GeoPoints at those locations to represent each rock.

Adrian implemented (1). For (2), Karman worked on the TCP server Python script that hosts the model on attu, while Adrian brought it to its final version and implemented the corresponding TCP client script in Unity. Karman wrote the color reduction code for (3), which Marcus refined and adapted into C# and Unity. For (4), Marcus wrote the algorithm for locating rocks and calculating the pixel coordinates of their centers. Adrian took advantage of the HoloLensCameraStream utility functions to implement (5), while structuring the codebase for the entire five-step process. Adrian also implemented an additional feature: for each GeoPoint, the image of the rock cropped from the original camera image is displayed on the GeoPoint’s panel.
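To give a flavor of step 3, here’s a minimal color reduction sketch that quantizes each channel down to a handful of levels so that segmented regions collapse into flat colors with crisp edges. It’s a simplified illustration, not the exact code Karman and Marcus wrote:

```csharp
using UnityEngine;

// Sketch of step 3: quantize each color channel so segmented regions
// collapse into a few flat colors with distinct boundaries between objects.
public static class ColorReduction
{
    public static Color32[] Reduce(Color32[] pixels, int levels = 4)
    {
        // Size of each quantization bucket per 8-bit channel.
        int step = 256 / levels;
        var result = new Color32[pixels.Length];
        for (int i = 0; i < pixels.Length; i++)
        {
            Color32 p = pixels[i];
            result[i] = new Color32(
                (byte)(p.r / step * step),
                (byte)(p.g / step * step),
                (byte)(p.b / step * step),
                255);
        }
        return result;
    }
}
```

Once the image is reduced this way, the bounding-box pass in step 4 mostly amounts to finding connected regions that share the same flat color.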
As a result of implementing HoloLensCameraStream, our passive frame rate outside of scanning increased drastically, from roughly 10 fps to roughly 40 fps. After a few all-nighters, the TCP server/client code for attu and Unity cut our scanning time from a 30-60 second application freeze to an asynchronous 2-second delay. In the midst of all this, we drafted our final presentation and video for the class and demo day. All around, a productive week.
Continuing into next quarter, our main focus will be diving into the geological analysis side of things now that the performance issues are resolved. As we may have mentioned previously, we’re currently envisioning a system where the user essentially defines what a “geological point of interest” is themselves, telling the app what rock features they’re seeking (e.g. coarse-grained, basaltic, etc.). With those features in mind, our feature extraction model will analyze the nearby geology via computer vision and spawn waypoints at the locations of matching rocks. But, for now, we’ll take it easy over spring break. Thanks for your support! We appreciate you!
Due to some private concerns, Karman was out of commission for most of the week, though he kept up as best he could by experimenting with different models and model parameters, such as input resolution, in pursuit of better performance; results are pending. Meanwhile, Adrian, Marcus, and Muskan attacked the performance issue with other potential solutions and created a floating hand-menu list for viewing and managing your existing GeoPoints.
Although away from school due to private concerns, Karman looked into alternative model architectures and parameters, including image resolution, to improve model execution performance on the HoloLens.
Muskan and Adrian collaborated on the list UI, which follows the user and displays a list of your GeoPoints, with a delete button next to each entry for removing the corresponding GeoPoint. Developing this piece of UI was a real undertaking: features such as list scrolling were filled with nuanced issues, and the list had to know about every existing GeoPoint and be able to delete them.
Marcus fixed an issue related to starting/stopping the camera before and after GeoScans, and cleaned up a memory leak caused by a lack of tensor disposal. He also set up a separate branch to begin testing out the HoloLensCameraStream plugin.
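For context, Barracuda tensors wrap native buffers, so forgetting to dispose them leaks a little memory on every scan. Here’s a minimal sketch of the pattern behind that kind of fix, assuming the model is already loaded into an IWorker (not Marcus’s exact code):

```csharp
using Unity.Barracuda;
using UnityEngine;

// Illustrative disposal pattern: input tensors we create must be disposed
// after execution, or their native buffers leak a bit more on every GeoScan.
public class SegmentationRunner
{
    public float[] Run(IWorker worker, Texture2D image)
    {
        using (var input = new Tensor(image, channels: 3))
        {
            worker.Execute(input);
            // PeekOutput returns a tensor owned by the worker, so we copy its
            // data out rather than disposing it ourselves.
            Tensor output = worker.PeekOutput();
            return output.ToReadOnlyArray();
        }
    }
}
```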
Adrian implemented a technique for recycling GameObjects called object pooling in order to decrease lag when spawning a GeoPoint, with limited success. He also updated Barracuda from 2.0.0 to 3.0.0, the latest version, with some benefit to model execution performance. However, his attempts at asynchronous model execution and scheduled execution, where the model executes incrementally over multiple frames, didn’t pan out; he couldn’t get either working correctly. Additionally, Adrian helped out with implementing the HoloLensCameraStream plugin, studying the provided example project and drafting some scripts that take advantage of its camera-input and pixel-to-world coordinate-converting functionality, although they have yet to be tested.
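Object pooling in a nutshell: instead of calling Instantiate and Destroy for every GeoPoint, keep a stack of inactive instances and reuse them. A minimal sketch under our own assumptions (the prefab field is illustrative):

```csharp
using System.Collections.Generic;
using UnityEngine;

// Minimal object pool sketch: reuse inactive GeoPoint instances instead of
// instantiating and destroying them, which causes frame spikes and GC churn.
public class GeoPointPool : MonoBehaviour
{
    [SerializeField] private GameObject geoPointPrefab;  // illustrative field
    private readonly Stack<GameObject> pool = new Stack<GameObject>();

    public GameObject Spawn(Vector3 position)
    {
        GameObject point = pool.Count > 0 ? pool.Pop() : Instantiate(geoPointPrefab);
        point.transform.position = position;
        point.SetActive(true);
        return point;
    }

    public void Despawn(GameObject point)
    {
        // Deactivate and keep the instance around for later reuse.
        point.SetActive(false);
        pool.Push(point);
    }
}
```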
After getting Karman’s image segmentation model up and running last week, we narrowed down our major priorities to improving our app’s horrible performance and adding some real geological analysis. Before that, though, the team worked to implement some UI changes that were suggested on our feedback form.
Muskan and Adrian implemented a 3-second countdown that appears before every GeoScan, giving users time to aim their headset and prepare for the scan. A camera shutter sound effect now plays at the end of the countdown. Also, the UI panels representing GeoPoints were made to face the user at all times for visual clarity and ease of use.
Adrian added a delete button to the floating GeoPoint panels and made sure that Karman’s model and the new UI changes could be built and run on the HoloLens. Here, we found that performance was limited to about 2 frames per second, with a 30-second freeze on every GeoScan.
Marcus improved performance by enabling camera texture reads from the HoloLens only at the moment a GeoScan is performed. He also added a small reticle in the center of the user’s view that indicates where in world space the GeoScan will occur.
Karman, delving into the geological analysis, researched different methods of post-processing segmented images to identify geological points of interest, working under the provisional assumption that a site with many small rocks is more viable for research than one with no rocks or only larger rocks. We’re still waiting on the geology professor we’re in contact with to get back to us; then, we should have better indicators of promising research locations.
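To make that assumption concrete, the heuristic we have in mind looks roughly like the sketch below. The thresholds and the Rect-based bounding boxes are placeholders and will almost certainly change once we hear back from the geology side:

```csharp
using System.Collections.Generic;
using System.Linq;
using UnityEngine;

// Placeholder heuristic: a scan counts as "interesting" when it contains
// many small rocks. Thresholds are made up and will change with expert input.
public static class PoiHeuristic
{
    public static bool IsPointOfInterest(List<Rect> rockBounds,
                                         float smallAreaThreshold = 2000f,
                                         int minSmallRocks = 5)
    {
        // Count bounding boxes whose pixel area falls below the threshold.
        int smallRocks = rockBounds.Count(r => r.width * r.height < smallAreaThreshold);
        return smallRocks >= minSmallRocks;
    }
}
```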
We’re taking our minimum viable product and starting to move toward our target product in these last few weeks. Getting Karman’s model to work with our project is the highest priority, as it’s the core of our app. However, Marcus was put out of commission by some unfortunate food poisoning this week, so progress slowed a bit.
Karman and Adrian finally got Karman’s image segmentation model to display segmented images, although the results are slightly different from what Python displays using Matplotlib. Karman had compared the Python and Unity output tensors from running the original and Barracuda models on the same image, and they were different; next, we compared the input tensors for that same image, and they were different too! It turns out the model required RGB values in the range 0-255, while the methods we used in Unity produced values in the range 0-1.
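The fix itself is small once you know what’s wrong: build the input tensor from raw 0-255 pixel values instead of Unity’s normalized floats. Here’s a sketch of the idea, with shapes and names that differ from our real code:

```csharp
using Unity.Barracuda;
using UnityEngine;

// Sketch of the input-range fix: fill the input tensor with raw pixel
// bytes (0-255) instead of Unity's normalized 0-1 floats.
public static class InputRange
{
    public static Tensor ToByteRangeTensor(Texture2D image)
    {
        // GetPixels32 returns rows bottom-to-top; a vertical flip may be
        // needed depending on what the model expects.
        Color32[] pixels = image.GetPixels32();
        var data = new float[pixels.Length * 3];
        for (int i = 0; i < pixels.Length; i++)
        {
            data[i * 3 + 0] = pixels[i].r;   // already 0-255 as bytes
            data[i * 3 + 1] = pixels[i].g;
            data[i * 3 + 2] = pixels[i].b;
        }
        // NHWC layout: 1 batch, height, width, 3 channels.
        return new Tensor(1, image.height, image.width, 3, data);
    }
}
```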
Meanwhile, Muskan is learning about the basics of UI in Unity, as much of our feedback from last week mentioned a lack of audio/visual communication with the user regarding when the scan will occur, where the GeoPoint will appear, and more.
Crunch time for our midterm presentation and feedback sessions. In all, we were able to finish a working demo featuring an image segmentation model, an operational hand menu, and floating UI to display the segmented image. Karman’s model couldn’t be integrated into the Unity project in time, but that wasn’t necessary for the feedback sessions, where we wanted to focus on our app’s user experience.
After finishing up his rock image segmentation model, Karman worked into the night with Adrian to get his model working in our Unity project, testing out different images and such to narrow down the issue, but ultimately to no avail. They’ve found some leads and should be able to resolve the issue soon.
On a side note, Muskan put together a great logo for the team, which you can now find on the website.
Adrian, Marcus, and Muskan brainstormed what UI might work best for scanning objects and presenting geological information to the user. They tested out different ideas, one being a fixed UI panel in the top left of the screen that displayed information from the most recent geological scan, but they soon found that fixed UI in AR often feels obtrusive and bothersome. They settled on a hand menu that offers a button to manually scan and another to toggle an automatic scan on a timer, as well as a floating, movable, and rotatable world-space panel to display each scan’s geological information. Implementing this was very time-consuming: only late in development did we find out that MRTK 3 prefabs, particularly the hand menu, do not work out of the box with our current MRTK environment, which was a huge point of frustration. Switching to MRTK 2 prefabs got things working instantly.
Everyone helped put together the midterm presentation for Tuesday and gather user feedback during our Thursday feedback sessions.
Just getting Karman’s model to run correctly in Unity using Barracuda.
Over the past couple of weeks, the team was somewhat confused about the exact goal of our computer vision model, having to wait on additional mission details from NASA about the NASA SUITS challenge. We had assumed that our model was to be used to identify nearby geological points of interest after analyzing images of the nearby geology.
After attending a mission debriefing presented by NASA concerning the technical details of the NASA SUITS challenge on Thursday, we discovered that the geological points of interest during the actual testing of our software are to be predetermined. In addition, the rocks to be scanned are likely also predetermined, having RFID tags attached to them. If this is true, applying our model to this test would be redundant. What now?
Thankfully, our instructor clarified that our team should not be concerned with gearing the model toward helping with the test and that our project is more of a research objective, a proof of concept that is part of the team’s proposal to NASA. Our model may not help with completing the test administered by NASA, but, if our project is successful, it would help astronauts on a real research expedition on the moon.
With that settled, the team can confidently continue building the model with its original purpose in mind. With limited time, we’re striving to have a real minimum viable product ready before the midterm presentation next week.
The first half of the week was slow-paced, as the team was still unsure as to what direction we should take with our project.
Marcus researched various pre-existing image segmentation applications geared toward the HoloLens and implemented some of them for comparison. The team settled on an approach that leverages Barracuda, a library that lets custom neural networks run easily inside Unity. Marcus also fixed the portrait images on our website!
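For anyone unfamiliar with Barracuda, running an imported ONNX model boils down to a few calls. This is a generic usage sketch, not our project code:

```csharp
using Unity.Barracuda;
using UnityEngine;

// Generic Barracuda setup sketch: import the ONNX model as an NNModel asset,
// load it, and create a worker that runs inference on-device.
public class SegmentationModel : MonoBehaviour
{
    [SerializeField] private NNModel modelAsset;   // assigned in the Inspector
    private IWorker worker;

    private void Start()
    {
        Model model = ModelLoader.Load(modelAsset);
        // Type.Auto lets Barracuda pick a GPU or CPU backend for the device.
        worker = WorkerFactory.CreateWorker(WorkerFactory.Type.Auto, model);
        // Inference then amounts to worker.Execute(inputTensor) followed by
        // worker.PeekOutput() to read the segmentation result.
    }

    private void OnDestroy()
    {
        worker?.Dispose();
    }
}
```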
Karman spent many hours constructing a custom working model for image segmentation, based on transfer learning and the U-Net architecture, that can segment smaller rocks from larger rocks and the image background. He ran into RAM limits on Google Colab, so we spent some of our budget on Google Colab Pro.
Muskan searched for existing models and datasets that could come in handy and settled on an artificial lunar landscape dataset found on Kaggle.
Adrian combed through several papers describing existing rock identification neural networks in detail, which could inform the design of our own model. The papers mention several complications worth keeping in mind during development, including lighting, weather, visual obstructions, focal distance, rock similarity, and hardware processing power; each of these may interfere with rock scanning and model speed/accuracy. Adrian also did some research on lunar geology, since it may inform what rocks our model needs to train on.
Both Karman and Adrian went and sought out a resident expert on lunar geology, looking around UW’s Johnson Hall. They’re waiting to set up an appointment with a certain professor so they can ask what geological visual features are important for our model to extract in order to determine whether a rock is unusual and worth further research.
None.
Week 3 brought with it a lot of unforeseen setbacks: Karman took an indefinite break to take care of personal issues, and Adrian caught COVID-19. But nothing that can’t be overcome with some hard work and elbow grease.
Muskan found a potential dataset that has images labeled by what kind of rock they are, but the selection of rocks is very limited. To train our model, we may need to research what kind of rocks and minerals can be found on the moon and train our model accordingly.
Marcus started working out the details of how to export camera input and spatial data from the HoloLens 2 and bridge it to the model. Progress on this will likely continue into next week.
In Karman’s absence, Adrian and Muskan are working to get started on the actual model. A research paper they found by Ran et al. features a rock classification deep convolutional neural network that seems very close to our needs, and it describes many of the authors’ design decisions, which should help inform our own. This should be a great starting point before we adjust the model for our own needs, although we’re not entirely sure what those needs are yet, since we’re still missing some mission details.
COVID begone, so that Adrian can rejoin the group in person.
Still waiting on mission details from NASA so we can decide on an exact goal for our model.
Closing out the planning phase of the project, we ironed out the details and timeline of the project in a requirements document available here.
Weeks 4 through 6 will be spent implementing the MVP for our computer vision model and corresponding UI while gathering the necessary mission and geology details, and weeks 7 through 9 will involve adding additional quality-of-life features. Testing of existing and new features will take place throughout the entire process.
All four group members discussed the details of the project requirements document together, which was the bulk of our effort.
Meanwhile, Karman and Marcus combed through the NASA SUITS shared drive given to us by the UW Reality Lab for mission details related to our geology task and the exact scope of our work relative to the team’s. In charge of designing the model, Karman looked further into resources on spectroscopy and people we can contact to learn about rock and mineral identification.
Aside from that, Muskan and Adrian put their heads together to try to debug the website’s image issues, which are still a work in progress.
We’re waiting for a meeting with the UW Reality Lab on January 23rd so we can ask some questions related to integrating our UI with theirs, as well as set up communication for the future. Other than that, no particular blockers.
Our first week together was spent onboarding with the UW Reality Lab team, learning the details of the NASA SUITS challenge and what features we were tasked with implementing. And so, we narrowed our focus to the application of computer vision to spectral geology. All the while, we were getting acquainted with VR/AR development and the course structure.
Working with Jekyll and GitHub Pages, Adrian and Muskan got this team website up and running. Meanwhile, Karman began looking into existing literature on computer vision and spectral geology, as well as usable training data for the machine learning behind spectral analysis. Marcus organized group communication so that we can keep a team workflow going, and both Karman and Marcus read through the NASA SUITS mission description to help us get a handle on our exact project requirements.
We’d like to set up communication with the UW Reality Lab NASA SUITS team and perhaps get some advice to help us get started on research and development.
Questions to ask: