Nice! But it can be done faster if the original data is in RAM:
    ;------------------------------------------------------
    ; Input: hl = Pointer to data
    ; Output: d = high nibble, e = low nibble
    ; Modifies: a, d, e, flags ([hl] is rotated but restored on exit)
    ;------------------------------------------------------
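    GET_NIBBLES:
    xor a       ; a = 0
    rrd         ; a = low nibble, [hl] = 0:high
    ld e,a      ; e = low nibble
    rrd         ; a = high nibble, [hl] = low:0
    ld d,a      ; d = high nibble
    rrd         ; a = 0, [hl] restored to high:low
    ret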
This is ideal for unpacking byte-packed data with HL as the pointer and BC as the counter, since the routine leaves both registers untouched. What do you think?
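For example, a loop around GET_NIBBLES could look roughly like this. It is only a sketch: the label UNPACK_NIBBLES and the use of IX as the destination pointer are my own assumptions, and it expects BC to be non-zero on entry.
    ;------------------------------------------------------
    ; Input: hl = Pointer to packed data (in RAM)
    ;        ix = Pointer to output buffer (assumption)
    ;        bc = Number of packed bytes (must not be 0)
    ; Output: two bytes per input byte, high nibble first
    ; Modifies: a, bc, de, hl, ix
    ;------------------------------------------------------
    UNPACK_NIBBLES:
    call GET_NIBBLES ; d = high nibble, e = low nibble
    ld [ix+0],d      ; store high nibble
    ld [ix+1],e      ; store low nibble
    inc ix
    inc ix
    inc hl           ; next packed byte
    dec bc           ; dec bc does not set flags...
    ld a,b
    or c             ; ...so test bc for zero by hand
    jr nz,UNPACK_NIBBLES
    ret
If speed matters, the body of GET_NIBBLES can be pasted in place of the call to save the call/ret overhead on every byte.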

Of course, if the data is not in RAM you can add these lines:
    ;------------------------------------------------------
    ; Input: a = data
    ; Output: d = high nibble, e = low nibble
    ; Modifies: a, de, hl, flags, the byte at FREEBYTEINRAM
    ;------------------------------------------------------
    GET_NIBBLESROM:
    ld hl,FREEBYTEINRAM ; you need to reserve one scratch byte in RAM for this
    ld [hl],a
    jp GET_NIBBLES ; or place the appropriate code here