Friday, October 8, 2010

3D Picking in Android

Introduction

In this short tutorial I’m presenting something that made me lose weeks of work: how to implement picking with a perspective camera on the Android platform using OpenGL ES 1.0.

The process of picking basically involves the user clicking a point on their device screen; we take that point and apply the inverse of the transforms that OpenGL applies to its 3D scene, and so get a point in the world coordinate system (WCS), which is where the player wanted to click. For the sake of simplicity, we will work on a simple 2D map, instead of having to cast a ray and intersect it with multiple objects.

Usually, in OpenGL we would use the function gluUnProject to un-project the point and so get the equivalent WCS point, but that function is plagued by errors on the Android platform and it’s very difficult to get the GL transformations for the projection and modelview matrices.

Algorithm

So here is my solution. It might not be perfect, but it actually works.

Code Snippet

/**
 * Calculates the transform from screen coordinate
 * system to world coordinate system coordinates
 * for a specific point, given a camera position.
 *
 * @param touch Vec2 point of screen touch, the
 *   actual position on the physical screen (e.g. 160, 240)
 * @param cam camera object with x,y,z of the
 *   camera and screenWidth and screenHeight of
 *   the device.
 * @return position in WCS.
 */
public Vec2 GetWorldCoords(Vec2 touch, Camera cam)
{
    // Initialize auxiliary variables.
    Vec2 worldPos = new Vec2();

    // SCREEN height & width (e.g. 320 x 480)
    float screenW = cam.GetScreenWidth();
    float screenH = cam.GetScreenHeight();

    // Auxiliary matrix and vectors
    // to deal with ogl.
    float[] invertedMatrix, transformMatrix,
        normalizedInPoint, outPoint;
    invertedMatrix = new float[16];
    transformMatrix = new float[16];
    normalizedInPoint = new float[4];
    outPoint = new float[4];

    // Invert y coordinate, as android uses
    // top-left, and ogl bottom-left.
    int oglTouchY = (int) (screenH - touch.Y());

    /* Transform the screen point to clip
       space in ogl (-1,1) */
    normalizedInPoint[0] =
        (float) ((touch.X()) * 2.0f / screenW - 1.0);
    normalizedInPoint[1] =
        (float) ((oglTouchY) * 2.0f / screenH - 1.0);
    normalizedInPoint[2] = -1.0f;
    normalizedInPoint[3] = 1.0f;

    /* Obtain the transform matrix and
       then the inverse. */
    Print("Proj", getCurrentProjection(gl));
    Print("Model", getCurrentModelView(gl));
    Matrix.multiplyMM(
        transformMatrix, 0,
        getCurrentProjection(gl), 0,
        getCurrentModelView(gl), 0);
    Matrix.invertM(invertedMatrix, 0,
        transformMatrix, 0);

    /* Apply the inverse to the point
       in clip space */
    Matrix.multiplyMV(
        outPoint, 0,
        invertedMatrix, 0,
        normalizedInPoint, 0);

    if (outPoint[3] == 0.0)
    {
        // Avoid /0 error.
        Log.e("World coords", "ERROR!");
        return worldPos;
    }

    // Divide by the w (fourth) component to
    // find out the real position.
    worldPos.Set(
        outPoint[0] / outPoint[3],
        outPoint[1] / outPoint[3]);

    return worldPos;
}

In my case, I’ve got a render thread, a logic thread and an application thread; this function is a service provided by the render thread, because it needs the GL projection and modelview matrices.

What happens is that the logic thread sends a touch (x,y) position, and the current camera (x, y, z, screenH, screenW), to the GetWorldCoords function, and expects the world position of that point, taking into account the camera position (x,y,z) and the view frustum (represented by the projection and modelview matrices).

The first lines get the data ready: they create the auxiliary matrices and access the camera data.

One important point is the line

int oglTouchY = (int) (screenH - touch.Y());

This inversion is needed because Android screen coordinates assume a top-left origin, while OpenGL needs bottom-left, so we flip it. With that done, we can start the picking algorithm.
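To make the flip concrete, here is a tiny plain-Java sketch with a 480-pixel-tall screen and a touch at y = 330 (the class and method names are mine, not from the tutorial code):

```java
public class FlipY {
    /** Convert a top-left Android y coordinate to a bottom-left OpenGL one. */
    static int flip(float screenH, float touchY) {
        return (int) (screenH - touchY);
    }

    public static void main(String[] args) {
        // Touch 330 px from the top is 150 px from the bottom.
        System.out.println(flip(480f, 330f)); // 150
    }
}
```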

  1. Transform the point from screen coordinates (e.g. 120, 330) to clip space (for a 320 x 480 Android screen, this would be –0.25, –0.375)
  2. Get the transformation matrix (projection * modelView), and invert it.
  3. Multiply the clip-space point by the inverse transformation.
  4. Divide the x,y,z coordinates (positions 0,1,2) by the w component (position 3).
  5. You’ve got the world coordinates.
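The steps above can be sketched in plain Java, outside Android, by hand-rolling the column-major matrix-vector multiply that android.opengl.Matrix.multiplyMV performs. Note the inverted matrix below is a made-up uniform scale standing in for a real inverted projection * modelView, and all the names are mine:

```java
public class PickingSteps {
    /** Column-major 4x4 * 4-vector, same layout as android.opengl.Matrix. */
    static float[] multiplyMV(float[] m, float[] v) {
        float[] r = new float[4];
        for (int row = 0; row < 4; row++) {
            r[row] = m[row] * v[0] + m[4 + row] * v[1]
                   + m[8 + row] * v[2] + m[12 + row] * v[3];
        }
        return r;
    }

    static float[] getWorldCoords(float touchX, float touchY,
                                  float screenW, float screenH,
                                  float[] invertedMatrix) {
        // Step 1: screen -> clip space, flipping Y for OpenGL.
        float oglTouchY = screenH - touchY;
        float[] clip = {
            touchX * 2.0f / screenW - 1.0f,
            oglTouchY * 2.0f / screenH - 1.0f,
            -1.0f, 1.0f
        };
        // Step 3: apply the inverse transform.
        float[] out = multiplyMV(invertedMatrix, clip);
        // Step 4: divide x and y by the w component.
        return new float[] { out[0] / out[3], out[1] / out[3] };
    }

    public static void main(String[] args) {
        // Stand-in for invert(projection * modelView): scale x,y by 10.
        float[] inverted = {
            10, 0, 0, 0,
            0, 10, 0, 0,
            0, 0, 1, 0,
            0, 0, 0, 1
        };
        float[] world = getWorldCoords(120f, 330f, 320f, 480f, inverted);
        // Clip space (-0.25, -0.375) scaled by 10.
        System.out.println(world[0] + ", " + world[1]); // -2.5, -3.75
    }
}
```

With a real camera you would of course feed in the actual inverted transform; the point here is only to check the screen-to-clip mapping and the divide by w.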

Notes:

The z doesn’t appear because I have no need for it, but you can get it easily (outPoint[2] / outPoint[3]).
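For the 3D case, the same divide by w applies to the third component; a minimal sketch, with a hypothetical outPoint whose values are invented for illustration:

```java
public class ZDivide {
    /** Divide the z component of a homogeneous point by its w. */
    static float worldZ(float[] outPoint) {
        return outPoint[2] / outPoint[3];
    }

    public static void main(String[] args) {
        // Made-up result of the inverse multiply: (x, y, z, w).
        float[] outPoint = { 2.0f, -1.0f, 4.0f, 2.0f };
        System.out.println(worldZ(outPoint)); // 2.0
    }
}
```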

The situation I’m working on is the following: the red and blue lines are the frustum limits, the green is the world map at an arbitrary point in space, and the camera is at the tip of the view frustum.


There is one very special complication when doing this picking algorithm on the Android platform, and that is accessing the projection and modelview matrices OpenGL uses. We manage with the following code.

Code Snippet

/**
 * Record the current modelView matrix
 * state. Has the side effect of
 * setting the current matrix state
 * to GL_MODELVIEW
 * @param gl context
 */
public float[] getCurrentModelView(GL10 gl)
{
    float[] mModelView = new float[16];
    getMatrix(gl, GL10.GL_MODELVIEW, mModelView);
    return mModelView;
}

/**
 * Record the current projection matrix
 * state. Has the side effect of
 * setting the current matrix state
 * to GL_PROJECTION
 * @param gl context
 */
public float[] getCurrentProjection(GL10 gl)
{
    float[] mProjection = new float[16];
    getMatrix(gl, GL10.GL_PROJECTION, mProjection);
    return mProjection;
}

/**
 * Fetches a specific matrix from opengl
 * @param gl context
 * @param mode of the matrix
 * @param mat initialized float[16] array
 * to fill with the matrix
 */
private void getMatrix(GL10 gl, int mode, float[] mat)
{
    MatrixTrackingGL gl2 = (MatrixTrackingGL) gl;
    gl2.glMatrixMode(mode);
    gl2.getMatrix(mat, 0);
}

 

The gl parameter passed to the getCurrent*(GL10 gl) functions is stored as a member variable in the class.

The MatrixTrackingGL class is part of the Android samples, and can be found here. Some other classes must be included for it to work (mainly MatrixStack). The MatrixTrackingGL class acts as a wrapper for the GL context while providing the data we need. For it to work, our custom GLSurfaceView class must set a GLWrapper, something like this.

Code Snippet

public DagGLSurfaceView(Context context)
{
    super(context);

    setFocusable(true);

    // Wrapper set so the renderer can
    // access the gl transformation matrices.
    setGLWrapper(
        new GLSurfaceView.GLWrapper()
        {
            @Override
            public GL wrap(GL gl)
            {
                return new MatrixTrackingGL(gl);
            }
        });

    mRenderer = new DagRenderer();
    setRenderer(mRenderer);
}

(Where DagRenderer is my GLSurfaceView.Renderer, and DagGLSurfaceView is my GLSurfaceView.)