How Visual SLAM creates cameras that see

How does a robot know where it is when it doesn’t have eyes? If it’s built with Visual SLAM, then it uses specially adapted Canon cameras to see.

Written by Sarah Vloothuis

Senior Manager External Communications

How can a robot see when it’s about to bump into something? Well, technically it can’t because robots don’t have eyes, silly. But they do have the ability to calculate where they are in space through a combination of technologies. So, while it’s not actually sight, it is pretty close and, when you think about it, also incredibly cool. After all, without any kind of vision, even the smartest robot is useless, isn’t it?

And this is why robotics remains one of the biggest technology challenges today. It requires a combination of many different areas of expertise – from multiple disciplines of robotics, control, mechanical and electrical engineering, to skills in software development, materials, mechatronics and more. Each element must operate smoothly within the whole and it’s a delicate balance. The way the robot moves, for example, is dictated by the materials used to build it, how each of its components operates with the others, how they are made to move and how they are powered. But if you want a robot to move independently, how can it do so unless it has a way to ‘see’ where it is going?

Of course, optical and imaging technologies are what we do best, so it will come as no surprise that Canon has been working on this particular area of robotics for quite some time. Over thirty years, in fact. Today, it’s called Visual SLAM, which is ironic considering that ‘slamming’ into anything is precisely what the technology prevents. SLAM, or ‘Simultaneous Localisation and Mapping’ (which is less catchy but certainly more accurate), is a technology that simultaneously estimates a robot’s own position and the structure of its surrounding environment. The original Visual SLAM system was created to merge real and virtual worlds in a head-mounted display, which we now know as Mixed Reality. Today, Visual SLAM could be used across all kinds of automated tasks in industries from manufacturing and hospitality to healthcare and construction.

How does Visual SLAM work?

Mobile robots, such as AGVs (Automated Guided Vehicles) and AMRs (Autonomous Mobile Robots), are already a familiar sight in warehouses and logistics operations, and often they are guided along a track of magnetic tape affixed to the ground. As you might imagine, this is expensive, takes time to install and, crucially, creates inflexibility. If the AGVs and AMRs can only operate along a fixed route, what happens when the route needs to be changed? Or if a business needs to be able to shift operations quickly? So, a ‘guideless’ approach is highly desirable in a world where, let’s face it, things change.

The answer is to use one of the two kinds of SLAM. The first is based on LiDAR, which stands for ‘Light Detection and Ranging’ and uses laser pulses to measure distances and the shapes of surrounding structures. While LiDAR systems are great in that they can work in dimly lit areas, they generally use sensors that only perform horizontal scanning, which immediately limits the information a robot can obtain to a two-dimensional slice of its surroundings. This isn’t because three-dimensional scanning is impossible – it’s just incredibly expensive. The other problem with LiDAR is that if there isn’t enough for the robot to ‘see’, then 3D objects need to be installed around its route.
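To make the ‘horizontal scanning’ limitation concrete, here is a minimal sketch of the textbook arithmetic behind a single laser sweep. It is not taken from any particular LiDAR product: the function names, the angle parameters and the sample values are all assumptions chosen for illustration. Each pulse gives a distance from its round-trip time, and a sweep of distances at known angles becomes a set of (x, y) points. Notice that there is no height value anywhere.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def range_from_round_trip(round_trip_s):
    """Distance to a surface from one pulse's round-trip time.

    The pulse travels out and back, so the one-way distance is half
    of the total path length.
    """
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


def scan_to_points(ranges_m, start_angle_rad, angle_step_rad):
    """Convert one horizontal sweep into (x, y) points in the sensor frame.

    Hypothetical helper: assumes evenly spaced beams in a single plane,
    which is why the result has no z coordinate.
    """
    points = []
    for i, r in enumerate(ranges_m):
        angle = start_angle_rad + i * angle_step_rad
        points.append((r * math.cos(angle), r * math.sin(angle)))
    return points


if __name__ == "__main__":
    # A pulse that returns after 20 nanoseconds hit something about 3 m away.
    print(range_from_round_trip(20e-9))
    # Three beams at 0, 45 and 90 degrees (made-up sample ranges in metres).
    print(scan_to_points([2.0, 2.5, 1.5], 0.0, math.pi / 4))
```

Every point lies in the same flat plane as the sensor, so a box sitting above or below that plane simply never appears in the data.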

Instead of lasers, Visual SLAM by Canon uses cameras as sensors, which is cheaper than LiDAR but still delivers high-precision measurement. A combination of video images and a proprietary analysis technique identifies the 3D shapes of structures, and the robot works out its own position relative to those shapes, which is the ‘localisation’ part of SLAM’s name. Amazingly, this even extends to objects with flat surfaces, such as posters on walls, so it doesn’t need any additional 3D objects installed, as LiDAR does. This also means it can be used in many more places and situations and, because the cameras can be used for image recognition too, there are other ways to put Visual SLAM to use, such as in drones or service robots.

How does it understand change?

Because the spaces in which AGVs and AMRs operate are highly changeable, Visual SLAM needs to be smart too. The images obtained by left and right stereo cameras are continually processed by Canon’s ‘Vision-based Navigation Software for AGV’, which turns the images into real-time 3D maps and updates them automatically. This is a huge amount of precise information to process, but the software is designed to do so in real time – even on a low-end computer. The constant nature of capture and processing means that robots using Visual SLAM can essentially ‘navigate’ all by themselves.
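Canon’s analysis technique is proprietary, so the following is not its method. It is only a minimal sketch of the standard textbook relation that lets any calibrated, rectified pair of left and right cameras recover depth: a feature that shifts by fewer pixels between the two images is further away. The function names, camera parameters and sample numbers are all assumptions made up for illustration.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth of a feature seen by both cameras of a rectified stereo pair.

    Uses the pinhole relation Z = f * B / d, where f is the focal length
    in pixels, B the distance between the cameras in metres, and d the
    horizontal shift of the feature between left and right images.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a visible point")
    return focal_length_px * baseline_m / disparity_px


def pixel_to_point(u, v, disparity_px, focal_length_px, baseline_m, cx, cy):
    """Back-project a left-image pixel (u, v) to an (x, y, z) point.

    Hypothetical helper: (cx, cy) is the assumed principal point of the
    left camera, and the result is expressed in that camera's frame.
    """
    z = depth_from_disparity(focal_length_px, baseline_m, disparity_px)
    x = (u - cx) * z / focal_length_px
    y = (v - cy) * z / focal_length_px
    return (x, y, z)


if __name__ == "__main__":
    # Made-up camera: 700 px focal length, cameras 10 cm apart.
    print(depth_from_disparity(700.0, 0.1, 35.0))
    print(pixel_to_point(420, 240, 35.0, 700.0, 0.1, 320.0, 240.0))
```

Repeating this for many matched features in every frame yields a cloud of 3D points, which is the raw material a real-time map would be built from and refreshed with as the scene changes.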

This makes it ideal for robots in all sorts of spaces, particularly those where humans might be exposed to danger. Examples include transporting hazardous materials inside chemical plants, or any setting where ‘contactless’ movement of products is required for human safety. It has even been suggested that robots using Visual SLAM could be used in medical facilities to serve food and medicines, where high-risk patients require the absolute minimum of contact to keep them safe as they undergo treatment. Such environments are dynamic and fast-moving, so a robot’s ability to ‘learn’ as it works is essential.

As with all interdisciplinary technologies, as one element progresses it opens opportunities for others and naturally, we can expect the same with robot vision. After all, our optical, sensor and image-processing techniques have been refined through the development of our camera and lens products. Creating robot ‘eyes’ that are affordable and accessible by all industries is just one step closer to the kind of automated solutions that make everyday life safer, more comfortable and more convenient for us all.

Discover more about Canon’s Visual SLAM technologies on the Canon Global Website.
