Can Machine Learning Improve Robot Kinematics?

I’ve tried several times to hand-code inverse kinematics for industrial robot arms in Robot Overlord.  To make a long story short, there are a lot of complicated edge cases that break my brain.  Many modern methods involve a lot of guesswork in the path planning.  I believe that a well-trained Machine Learning agent could do the job much better, but to date there are none I can download and install in my robot.  So I’m going to try to do it myself.  Join me!

The problem I’m trying to solve with Machine Learning

I have a 3D model of my arm in the Java app called Robot Overlord.  The 3D model is fully posable.  At any given pose I can calculate the angle of every joint and the exact location and orientation of the finger tip.  I have Forward Kinematics (FK), which is a tool to translate joint angles into the location and orientation of the finger tip.  I have Inverse Kinematics (IK), which is a tool to do the reverse: translate a finger tip location and orientation into joint angles.
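
To make FK concrete, here is a minimal sketch for a hypothetical two-link planar arm.  It is not the six-axis code in Robot Overlord; the class name, the link lengths, and the method are made up for illustration, and the angles are in radians.

```java
public class PlanarFk {
    /**
     * Forward kinematics for a hypothetical 2-link planar arm.
     * Returns {x, y, heading} of the finger tip.
     * The elbow angle is measured relative to the first link.
     */
    public static double[] forward(double l1, double l2, double shoulder, double elbow) {
        double x = l1 * Math.cos(shoulder) + l2 * Math.cos(shoulder + elbow);
        double y = l1 * Math.sin(shoulder) + l2 * Math.sin(shoulder + elbow);
        double heading = shoulder + elbow; // direction the last link points
        return new double[] { x, y, heading };
    }
}
```

With both links of length 1 and both angles at zero, the arm lies flat along the X axis and the finger tip sits at x=2, y=0.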

A robot arm is programmed with a series of poses.  Go to pose A, close the gripper, go to pose B, insert the part, go to pose C, etc… The robot software has to calculate the movement of the arm between poses and then adjust every motor simultaneously to drive the finger tip along the path between the two poses.  I’ve already solved the firmware part to drive six motors given sets of joint angles.
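
As a rough sketch of what "the path between the two poses" can mean, the snippet below cuts a straight line between two finger tip positions into waypoints.  The class and method names are hypothetical, and it only handles position; interpolating orientation needs more care and is left out here.

```java
public class PathSampler {
    /** Linear interpolation between two positions; t=0 is the start, t=1 is the end. */
    public static double[] lerp(double[] start, double[] end, double t) {
        double[] p = new double[start.length];
        for (int i = 0; i < start.length; i++) {
            p[i] = start[i] + (end[i] - start[i]) * t;
        }
        return p;
    }

    /** Splits the straight line into `steps` segments and returns steps+1 waypoints. */
    public static double[][] sample(double[] start, double[] end, int steps) {
        double[][] path = new double[steps + 1][];
        for (int i = 0; i <= steps; i++) {
            path[i] = lerp(start, end, (double) i / steps);
        }
        return path;
    }
}
```

Each waypoint would then be handed to IK to get one set of joint angles for the motors.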

The problem is that for one finger tip pose there might be many IK solutions, meaning many combinations of joint angles, and sometimes infinitely many.  To illustrate this, hold a finger on the table and move your elbow.  Your finger tip didn’t move, yet you had lots of possible wrist/shoulder changes.  As the arm moves through space it can cross a singularity (one of the spots with infinite solutions), and when it comes out the other side the hand-written solution flips some or all of the arm 180 degrees around.  A smarter system would have recognized the problem and (for instance) kept the elbow to the side.  I have tried to write better IK code but have not had any success.
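
The "many solutions" problem already shows up in a toy arm.  The sketch below solves IK for a hypothetical two-link planar arm (not my real six-axis arm) and returns both the elbow-one-way and elbow-the-other-way answers for the same finger tip position.  All names are invented for this example, and angles are in radians.

```java
public class PlanarIk {
    /** Forward kinematics: returns {x, y} of the finger tip. */
    public static double[] forward(double l1, double l2, double shoulder, double elbow) {
        return new double[] {
            l1 * Math.cos(shoulder) + l2 * Math.cos(shoulder + elbow),
            l1 * Math.sin(shoulder) + l2 * Math.sin(shoulder + elbow)
        };
    }

    /**
     * Inverse kinematics via the law of cosines.
     * Returns two {shoulder, elbow} pairs, or null if (x, y) is out of reach.
     */
    public static double[][] solve(double l1, double l2, double x, double y) {
        double c = (x * x + y * y - l1 * l1 - l2 * l2) / (2 * l1 * l2);
        if (c < -1.0 || c > 1.0) {
            return null; // target is too far away or too close
        }
        double[] signs = { 1.0, -1.0 }; // the two mirror-image elbow bends
        double[][] solutions = new double[2][];
        for (int i = 0; i < 2; i++) {
            double elbow = signs[i] * Math.acos(c);
            double shoulder = Math.atan2(y, x)
                    - Math.atan2(l2 * Math.sin(elbow), l1 + l2 * Math.cos(elbow));
            solutions[i] = new double[] { shoulder, elbow };
        }
        return solutions;
    }
}
```

Both answers put the finger tip in the same place, so something else has to choose between them.  When the arm is fully stretched the two answers merge into one, which is the planar version of the singularity described above.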

My Machine Learning Plan of Attack

My plan is to use a Deep Neural Network (DNN).  The DNN is a bit of a black box: on one side there is a layer of Inputs, on the other there is a layer of Outputs, and in between there are one or more hidden layers.  Inputs filter through the layers and come out as Outputs.  The magic is in the filtration process!  The filter can be trained with gradient descent using a cost function.  If I can score every input/output combination, I can let the DNN play with the virtual arm while the cost function watches and says “good, bad, better, worse” until the two work out all the best possible movements.
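
For anyone new to gradient descent, here is the idea shrunk down to a single number.  This toy is my own illustration and has nothing to do with the arm yet: the "network" is one value w, the made-up cost function is (w - 3)^2, and each step nudges w downhill along the gradient.

```java
public class GradientDescentToy {
    /** Toy cost: smallest (zero) when w equals 3. */
    public static double cost(double w) {
        return (w - 3.0) * (w - 3.0);
    }

    /** Derivative of the toy cost with respect to w. */
    public static double gradient(double w) {
        return 2.0 * (w - 3.0);
    }

    /** Repeatedly steps against the gradient and returns the final w. */
    public static double minimize(double start, double learningRate, int steps) {
        double w = start;
        for (int i = 0; i < steps; i++) {
            w -= learningRate * gradient(w);
        }
        return w;
    }
}
```

A real DNN does the same thing with thousands of weights at once, and the cost comes from scoring the arm's movement rather than from a simple formula.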

My Machine Learning Network design

I believe my inputs should be:

  • Arm starting pose: 6 random angle values.  Because DNN inputs are usually scaled to the range 0…1 I’ll say 0 is 0 degrees and 1 is 360 degrees.
  • Arm ending pose: 3 random position values and 3 random angle values.  Position values are scaled over the total movement range of the robot.  So if the robot can move on the X axis from -50 to +50, (x/100+0.5) would give a value 0…1.
  • Interpolation between both poses: 1 decimal number.  0 means at the start pose and 1 means at the end pose.
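
The input list above could be packed into a 13-value vector roughly like this.  The class name, the method names, and the choice to wrap angles into 0…360 before scaling are my assumptions for the sketch (wrapping means exactly 360 degrees encodes as 0), and it assumes every axis shares the same movement range.

```java
public class IkInputEncoder {
    /** Maps an angle in degrees to 0…1, wrapping it into 0…360 first (an assumption). */
    public static double normalizeAngle(double degrees) {
        double wrapped = ((degrees % 360.0) + 360.0) % 360.0;
        return wrapped / 360.0;
    }

    /** Maps a position in [min, max] to 0…1; for -50…+50 this equals x/100+0.5. */
    public static double normalizePosition(double value, double min, double max) {
        return (value - min) / (max - min);
    }

    /** Builds the 13 inputs: 6 start angles, 3 end positions, 3 end angles, 1 interpolation. */
    public static double[] encode(double[] startAnglesDeg, double[] endPosition,
                                  double[] endAnglesDeg, double min, double max, double t) {
        double[] input = new double[13];
        int k = 0;
        for (int i = 0; i < 6; i++) input[k++] = normalizeAngle(startAnglesDeg[i]);
        for (int i = 0; i < 3; i++) input[k++] = normalizePosition(endPosition[i], min, max);
        for (int i = 0; i < 3; i++) input[k++] = normalizeAngle(endAnglesDeg[i]);
        input[k] = t; // already 0…1: 0 is the start pose, 1 is the end pose
        return input;
    }
}
```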

I want my outputs to be:

  • Arm joint angles: 6 angle values.
  • Confidence flag: 1 number.  0 means “I definitely can’t reach that pose” and 1 means “I can reach that pose”.
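
Going the other way, the 7 outputs would need to be turned back into degrees and a yes/no answer.  This is only a sketch: the class is hypothetical, and the 0.5 cut-off for "reachable" is a number I picked for illustration, not something I have tested.

```java
public class IkOutputDecoder {
    /** Assumed cut-off: confidence at or above this counts as "I can reach that pose". */
    public static final double REACHABLE_THRESHOLD = 0.5;

    /** Converts the first 6 outputs (each 0…1) back into joint angles in degrees. */
    public static double[] jointAnglesDegrees(double[] outputs) {
        double[] angles = new double[6];
        for (int i = 0; i < 6; i++) {
            angles[i] = outputs[i] * 360.0;
        }
        return angles;
    }

    /** Reads the 7th output as the confidence flag. */
    public static boolean isReachable(double[] outputs) {
        return outputs[6] >= REACHABLE_THRESHOLD;
    }
}
```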

The cost function should work in two steps:

  1. make sure there is no error in the joint values, that is to say, the finger tip is actually on the path where it should be, and
  2. seek to reduce joint acceleration.  Adjust the elbow and the wrist ahead of time to avoid the need to suddenly twist.
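
One possible way to score those two steps is sketched below.  Everything here is an assumption on my part: the names, the use of straight-line distance for the path error, the finite-difference estimate of acceleration from three consecutive sets of joint angles, and folding both parts into a single weighted sum.  It also ignores angle wrap-around (359 degrees to 1 degree would look like a huge jump).

```java
public class IkCost {
    /** Step 1: distance between where the finger tip is and where the path says it should be. */
    public static double pathError(double[] actualTip, double[] targetTip) {
        double sum = 0.0;
        for (int i = 0; i < actualTip.length; i++) {
            double d = actualTip[i] - targetTip[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Step 2: sum of squared joint accelerations, estimated as next - 2*curr + prev. */
    public static double accelerationPenalty(double[] prev, double[] curr, double[] next) {
        double sum = 0.0;
        for (int i = 0; i < curr.length; i++) {
            double a = next[i] - 2.0 * curr[i] + prev[i];
            sum += a * a;
        }
        return sum;
    }

    /** Combined score; lower is better. The weight is a tuning knob, not a known-good value. */
    public static double cost(double[] actualTip, double[] targetTip,
                              double[] prev, double[] curr, double[] next,
                              double accelerationWeight) {
        return pathError(actualTip, targetTip)
                + accelerationWeight * accelerationPenalty(prev, curr, next);
    }
}
```

A joint that moves at a steady speed scores zero acceleration, so the penalty only grows when the arm has to twist suddenly.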

I’m going to try first with two hidden layers, then experiment from there.  I intuitively guess it will take at least two layers because there are two parts to the cost function.
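
To show the shape I have in mind (13 inputs, two hidden layers, 7 outputs), here is a plain-Java forward pass with random, untrained weights.  It deliberately does not use DL4J or TensorFlow; the class name, the hidden layer width of 32, and the sigmoid activation on every layer are placeholder assumptions, and there is no training code.

```java
import java.util.Random;

public class TinyMlp {
    private final double[][][] weights; // weights[layer][output][input]
    private final double[][] biases;    // biases[layer][output], all zero to start

    /** sizes lists the layer widths, e.g. {13, 32, 32, 7}. */
    public TinyMlp(int[] sizes, long seed) {
        Random rng = new Random(seed);
        weights = new double[sizes.length - 1][][];
        biases = new double[sizes.length - 1][];
        for (int l = 0; l < sizes.length - 1; l++) {
            weights[l] = new double[sizes[l + 1]][sizes[l]];
            biases[l] = new double[sizes[l + 1]];
            for (double[] row : weights[l]) {
                for (int i = 0; i < row.length; i++) {
                    row[i] = rng.nextGaussian() * 0.1; // small random starting weights
                }
            }
        }
    }

    /** Pushes the inputs through every layer; sigmoid keeps each output in 0…1. */
    public double[] forward(double[] input) {
        double[] activation = input;
        for (int l = 0; l < weights.length; l++) {
            double[] next = new double[weights[l].length];
            for (int o = 0; o < next.length; o++) {
                double sum = biases[l][o];
                for (int i = 0; i < activation.length; i++) {
                    sum += weights[l][o][i] * activation[i];
                }
                next[o] = 1.0 / (1.0 + Math.exp(-sum));
            }
            activation = next;
        }
        return activation;
    }
}
```

Calling `new TinyMlp(new int[] {13, 32, 32, 7}, 42L).forward(inputs)` returns 7 numbers between 0 and 1, which lines up with the 6 joint angles and 1 confidence flag described above.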

My Machine Learning code setup

The Robot Overlord source code is already written in Java so I’ve added TensorFlow and DL4J.  Currently I’m still walking through the MNIST quickstart tutorials and asking the DL4J chat room for help.  They already solved a few head-scratching differences between the DL4J quickstart tutorials and the DL4J up-to-date examples.  You can find my first test in Robot Overlord’s code at /src/main/java/com/marginallyclever/robotOverlord/


I hope that I’ve described my challenge thoroughly.  Please feel free to look at the code and make pull requests, or comment below or in the forums with any tips and advice you might have.  If you’re feeling helpful but not sure how, please share this far and wide so that I can reach the people who have the DNN know-how.

Stay awesome!

  1. Hi, Dan.

    I was looking for a mechanical arm to apply my knowledge of Reinforcement Learning and DNN.
    I came across your post just by chance, and found that you are looking for people with DNN “know-how”.
    The thing that you are trying to do looks sensible to me (I am going to work in the same direction myself),
    but you will have to take it to the next level to make it work.
    What you describe as “I can let the DNN play with the virtual arm while the cost function watches and says “good, bad, better, worse””
    is called Reinforcement Learning. There has been a lot of progress in this field recently; check the work of DeepMind, then imagine how you can translate it from
    the domain of games to robotics.
    There is a very long way from MNIST to Reinforcement Agents (you can check some OpenAI projects).
    Anyway, if you are interested we can continue the discussion and maybe establish some collaboration.

    Best regards
    Densiov Alexey, Ph.D.