Fixed YUV to RGB conversion, optimized ARM assembler code (iPhone)

February 6th, 2009

A new version of the optimized ARM code for image conversion (YUV to RGB) now fixes some colour problems.

You can download the code here

  1. loong.wu
    | #1

    Calling this function fails with “EXC_BAD_ACCESS”. The parameters should be correct; I call it like this:
    mb_YUV420_to_RGB32(320<<16|240, 320, (char**)avpict.data, (char*)dst);
    size of dst : 320*240*4

    I hope you can give an example of code.

  2. | #2

    The call fails because you pass “320” as the second parameter. The original YUV frame usually has a line size different from its width, probably because of codec requirements. We use something like this:

    mb_YUV420_to_RGB32(((codecCtx->width << 16) | (codecCtx->height)), frame->linesize[0], (char **)frame->data, (char *) aBuffer);
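
The argument packing in the call above can be sketched in C as follows. The helpers `pack_dimensions`, `packed_width`, `packed_height` and `rgb32_buffer_size` are hypothetical names introduced here for illustration; they are not part of the downloadable code. The layout (width in the upper 16 bits, height in the lower 16 bits) is inferred from the call in this comment, not from the assembler source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: width goes into the upper 16 bits and height into
   the lower 16 bits, mirroring (width << 16) | height in the call above. */
static uint32_t pack_dimensions(uint32_t width, uint32_t height)
{
    assert(width <= 0xFFFFu && height <= 0xFFFFu);
    return (width << 16) | height;
}

/* Inverse of pack_dimensions, useful for checking a packed value. */
static uint32_t packed_width(uint32_t packed)  { return packed >> 16; }
static uint32_t packed_height(uint32_t packed) { return packed & 0xFFFFu; }

/* Bytes needed for a destination buffer with 4 bytes per pixel. */
static size_t rgb32_buffer_size(uint32_t width, uint32_t height)
{
    return (size_t)width * height * 4u;
}
```

With these helpers the first argument would be `pack_dimensions(codecCtx->width, codecCtx->height)`, while the second argument stays `frame->linesize[0]`, which can be larger than the width when the decoder pads each row.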

  3. Francisco
    | #3

    Hi,

    Do you have the code for the opposite: RGB32 to YUV420P ?

    Thanks

  4. | #4

    I tested it on an iPhone, and it works.
    But I found it’s slower than libswscale (without asm optimization).

  5. | #5

    @Francisco : No, we have no code for the opposite at the moment.
    @luke : Could you send us a sample of the video you are trying to process? We didn’t find libswscale to be faster (not to mention the license which doesn’t allow it to be used in commercial applications)

  6. | #6

    @michele
    I tried a few 320×240 WMV videos: libswscale’s average processing time is 0.004 s, and this code’s average processing time is 0.006 s.

  7. loong.wu
    | #7

    michele, thank you.
    I have another question about ARM asm, and I hope you can help me solve it.
    The code:
    ———————————————————-
    .align 5

    function put_pixels8_xy2_arm, export=1

    @ void func(uint8_t *block, const uint8_t *pixels, int line_size, int h)

    @ block = word aligned, pixels = unaligned

    pld [r1]

    stmfd sp!, {r4-r11,lr} @ R14 is also called LR

    adr r12, 5f

    JMP_ALIGN r5, r1

    1:

    RND_XY2_EXPAND 0, lsl

    .align 5

    2:

    RND_XY2_EXPAND 1, lsl

    .align 5

    3:

    RND_XY2_EXPAND 2, lsl

    .align 5

    4:

    RND_XY2_EXPAND 3, lsl

    .align 5

    5:

    .long 0x03030303

    .long 0xFCFCFCFC >> 2

    .long 0x0F0F0F0F

    .align 5

    function put_no_rnd_pixels8_xy2_arm, export=1

    @ void func(uint8_t *block, const uint8_t *pixels, int line_size, int h)

    @ block = word aligned, pixels = unaligned

    pld [r1]

    stmfd sp!, {r4-r11,lr} @ R14 is also called LR

    adr r12,5f

    JMP_ALIGN r5, r1

    1:

    RND_XY2_EXPAND 0, lsr

    .align 5

    2:

    RND_XY2_EXPAND 1, lsr

    .align 5

    3:

    RND_XY2_EXPAND 2, lsr

    .align 5

    4:

    RND_XY2_EXPAND 3, lsr

    .align 5

    5:

    .long 0x03030303

    .long 0xFCFCFCFC >> 2

    .long 0x0F0F0F0F

    @.endfunc
    ——————————————————————
    error output:
    adr r12,5f invalid constant (4d4)

    When I change “adr r12, 5f” to “adr r12, 5b”, it compiles successfully. Why is that? I really do not understand ARM asm, but I need these files (several ARM asm files from ffmpeg4iphone).

    Thanks again
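
One possible reading of the error above, offered as an assumption rather than a confirmed diagnosis: `adr` is typically assembled as an add or subtract on `pc` with an immediate operand, and classic ARM immediates must be an 8-bit value rotated right by an even amount. If `4d4` in the message is the byte distance to label `5`, that value does not fit this form. The C sketch below only checks encodability; `arm_immediate_encodable` is a hypothetical helper name, not something from ffmpeg or the assembler.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (0 <= n < 32). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    return n == 0 ? v : (v << n) | (v >> (32 - n));
}

/* True if v fits the classic ARM data-processing immediate form:
   an 8-bit value rotated right by an even amount (0, 2, ..., 30).
   Rotating v left by the same amount undoes that rotation. */
static bool arm_immediate_encodable(uint32_t v)
{
    for (unsigned rot = 0; rot < 32; rot += 2) {
        if (rotl32(v, rot) <= 0xFFu)
            return true;
    }
    return false;
}
```

Here `arm_immediate_encodable(0x4d4)` is false because the set bits span nine positions, while a nearby value such as `0x4d0` would pass. Note also that `5b` refers to the nearest earlier label `5`, which is a different address from `5f`, so a successful build with `5b` does not by itself show that the code loads the intended constants.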

  8. Francisco
    | #9

    @michele do you have plans to implement such code?

  9. | #10

    @Francisco This is not planned at the moment.

  10. loong.wu
    | #11

    I want to render these RGB images on the screen. What is the quickest way: OpenGL ES, Core Animation, or something else?

  11. Yonas
    | #12

    What’s the conclusion…libswscale is faster than arm assembly?? That’s got to be some code coding :)

  12. | #13

    If you’re interested in code that does ARM YUV 2 RGB faster than this, consider http://www.wss.co.uk/pinknoise/yuv2rgb – the code there is released under the GNU GPL, but if that’s not suitable for you, contact me and we can discuss it.

  13. | #14

    This guy is really lucky !

  14. Keon
    | #15

    Hi.
    For me, mb_YUV420_to_RGB32 is faster than swscale. But I am a beginner with OpenGL ES, so I have not managed to display a normal picture yet.
    I tried:
    // image size 576 x 320
    kkk = (char*)malloc(576 * 320 * 4);
    mb_YUV420_to_RGB32(((576 << 16) | 320), avFrame1->linesize[0], (char**)avFrame1->data, kkk);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 256, 256, 0, GL_RGBA, GL_UNSIGNED_BYTE, kkk);

    [context1 presentRenderbuffer:GL_RENDERBUFFER_OES];

    ==> It showed some picture, but not a normal one.
    Then I tried the following for the same source:

    kkk = (char*)malloc(576 * 320 * 4);
    mb_YUV420_to_RGB32(((256 << 16) | 256), avFrame1->linesize[0], (char**)avFrame1->data, kkk);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 256, 256, 0, GL_RGBA, GL_UNSIGNED_BYTE, kkk);

    ==> Some part was magnified, but not colorful.
    How can I use this with glTexImage2D? Please help.

  15. Keon
    | #16

    It’s my mistake, a copy-and-paste bug…
    I wrote
    mb_YUV420_to_RGB32( (576 << 16 | 320), avFrame->linesize[0], (char**)avFrame->data, kkk);

    and glTexImage2D’s width and height can only be powers of two.

    Only mb_** (256<<16 | 256 …) with
    glTexImage2D(.. 256, 256, …) works,
    but that is not the original size.
    How can I show my images? (Of course I have used ffmpeg's sws_scale.)
    Please help me, and I am sorry for my poor English.
    Keon
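
A common way around the power-of-two restriction described above is to allocate a texture whose sides are the next powers of two, upload the image into one corner, and draw only that region. The C sketch below computes the sizes and texture coordinates. `next_pow2`, `TexLayout` and `layout_for_image` are hypothetical names introduced for illustration, and the sketch assumes, as this comment reports, that the GL implementation accepts only power-of-two texture sizes.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest power of two that is >= n. */
static uint32_t next_pow2(uint32_t n)
{
    assert(n > 0 && n <= 0x80000000u);
    uint32_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* Texture size and the texture coordinates covering the image region. */
typedef struct {
    uint32_t tex_w, tex_h;   /* power-of-two texture dimensions */
    float max_u, max_v;      /* right and bottom texture coordinates */
} TexLayout;

static TexLayout layout_for_image(uint32_t img_w, uint32_t img_h)
{
    TexLayout t;
    t.tex_w = next_pow2(img_w);
    t.tex_h = next_pow2(img_h);
    t.max_u = (float)img_w / (float)t.tex_w;
    t.max_v = (float)img_h / (float)t.tex_h;
    return t;
}
```

For the 576 x 320 frame this gives a 1024 x 512 texture. One would then create the texture once with `glTexImage2D` using those dimensions and a NULL data pointer, upload each converted frame with `glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 576, 320, GL_RGBA, GL_UNSIGNED_BYTE, kkk)`, and map the quad with texture coordinates from 0 to `max_u` and 0 to `max_v`. The conversion call itself keeps the real frame size, 576 << 16 | 320.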

  16. Keon
    | #17

    I don’t understand: I posted my question again, but it still does not display correctly.
    I used 576 << 16 | 320.

  17. ppnext
    | #18

    Hi, I want to use this code in my project,
    but I got these errors:

    no such instruction: `stmdb sp!, {r4,r5,r6,r7,r8,r9,r10,r11,r12,lr}’
    no such instruction: `uxth r4, r0,ror’
    no such instruction: `uxth r5, r0’
