Groups > comp.lang.java.programmer > #7454 > unrolled thread

Using encryption with special Unicode characters

Started by: "Qu0ll" <Qu0llSixFour@gmail.com>
First post: 2011-08-29 16:11 +1000
Last post: 2011-08-29 04:29 -0700
Articles: 5 (4 participants)

Contents

  Using encryption with special Unicode characters "Qu0ll" <Qu0llSixFour@gmail.com> - 2011-08-29 16:11 +1000
    Re: Using encryption with special Unicode characters Mayeul <mayeul.marguet@free.fr> - 2011-08-29 08:56 +0200
      Re: Using encryption with special Unicode characters "Qu0ll" <Qu0llSixFour@gmail.com> - 2011-08-29 17:18 +1000
    Re: Using encryption with special Unicode characters Peter Duniho <NpOeStPeAdM@NnOwSlPiAnMk.com> - 2011-08-29 00:29 -0700
    Re: Using encryption with special Unicode characters Roedy Green <see_website@mindprod.com.invalid> - 2011-08-29 04:29 -0700

#7454 — Using encryption with special Unicode characters

From: "Qu0ll" <Qu0llSixFour@gmail.com>
Date: 2011-08-29 16:11 +1000
Subject: Using encryption with special Unicode characters
Message-ID: <OvWdnUIcqeRBsMbTnZ2dnUVZ_gCdnZ2d@westnet.com.au>

This is my first go at using Java encryption.  I have a requirement to 
encrypt and then later decrypt a series of strings that may contain special 
Unicode characters such as "\u25bc".  The code below correctly encrypts and 
decrypts "normal" ASCII strings but turns characters like "\u25bc" into '?' 
when it decrypts (or maybe even when it encrypts).

It doesn't really matter which encryption algorithm I use as long as it is 
reasonably secure (I chose AES) but the encryption/decryption process needs 
to handle these special characters.

The output from the following code is:

Before char(0): 9660
After char(0): 63
Equal: false

How can I get this to work?  Here is the code:

import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

public class Encryption {

    private static final String ALGORITHM = "AES";

    private static final String KEY = "0123456789ABCDEF";

    private static final SecretKeySpec KEY_SPEC =
            new SecretKeySpec(KEY.getBytes(), ALGORITHM);

    private static Cipher cipherEncrypt;

    private static Cipher cipherDecrypt;

    static {
        try {
            cipherEncrypt = Cipher.getInstance(ALGORITHM);
            cipherEncrypt.init(Cipher.ENCRYPT_MODE, KEY_SPEC);
            cipherDecrypt = Cipher.getInstance(ALGORITHM);
            cipherDecrypt.init(Cipher.DECRYPT_MODE, KEY_SPEC);
        } catch (final Exception e) {
            e.printStackTrace();
        }
    }

    public static String decrypt(final byte[] raw) {
        String result = null;
        try {
            result = new String(cipherDecrypt.doFinal(raw));
        } catch (final Exception e) {
            e.printStackTrace();
        }

        return result;
    }

    public static byte[] encrypt(final String raw) {
        byte[] result = null;
        try {
            result = cipherEncrypt.doFinal(raw.getBytes());
        } catch (final Exception e) {
            e.printStackTrace();
        }

        return result;
    }

    public static void main(final String[] args) {
        final String before = "\u25bc ABC";
        System.out.println("Before char(0): " + (int)before.charAt(0));
        final String after = decrypt(encrypt(before));
        System.out.println("After char(0): " + (int)after.charAt(0));
        System.out.println("Equal: " + before.equals(after));
    }
}

--
And loving it,

-Qu0ll (Rare, not extinct)
_________________________________________________
Qu0llSixFour@gmail.com
[Replace the "SixFour" with numbers to email me] 

#7455

From: Mayeul <mayeul.marguet@free.fr>
Date: 2011-08-29 08:56 +0200
Message-ID: <4e5b3747$0$28393$426a74cc@news.free.fr>
In reply to: #7454

On 29/08/2011 08:11, Qu0ll wrote:
> This is my first go at using Java encryption. I have a requirement to
> encrypt and then later decrypt a series of strings that may contain
> special Unicode characters such as "\u25bc". The code below correctly
> encrypts and decrypts "normal" ASCII strings but turns characters like
> "\u25bc" into '?' when it decrypts (or maybe even when it encrypts).
>
> It doesn't really matter which encryption algorithm I use as long as it
> is reasonably secure (I chose AES) but the encryption/decryption process
> needs to handle these special characters.
>
> The output from the following code is:
>
> Before char(0): 9660
> After char(0): 63
> Equal: false
>
> How can I get this to work? Here is the code:
>
> [...]

String.getBytes() and String(byte[]) convert a String to a byte array
and back. That conversion is the job of a character encoding, which
Java calls a 'charset'. If you do not specify which charset you want to
use, they use your default charset, which depends on your environment.

This charset is not guaranteed to cover all of Unicode. In fact, in
western environments it is rather likely to be ISO-8859-1 or similar,
which can represent only a small subset of Unicode and replaces
anything else, such as "\u25bc", with '?'.

Which is why you're better off forcing the use of a charset that covers
all of Unicode, like UTF-8. UTF-8 and the UTF-16 variants are guaranteed
to be supported by every Java implementation, which makes them safe
choices.
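
As a rough illustration of that advice, here is a minimal sketch of the original class with the charset named explicitly on both conversions. The class name Utf8Encryption and the helper run() are made up for this example, and StandardCharsets assumes Java 7 or later. It keeps the hard-coded demo key and the bare "AES" transformation from the original post (on the standard JDK provider that means ECB mode), which is a separate concern from the charset problem.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

// Sketch only: hypothetical class name, same AES setup as the original post.
final class Utf8Encryption {

    // Hard-coded demo key copied from the original post; not for real use.
    private static final SecretKeySpec KEY_SPEC = new SecretKeySpec(
            "0123456789ABCDEF".getBytes(StandardCharsets.US_ASCII), "AES");

    // Runs one AES operation on raw bytes; no strings or charsets involved.
    private static byte[] run(int mode, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance("AES");
            cipher.init(mode, KEY_SPEC);
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES operation failed", e);
        }
    }

    static byte[] encrypt(String plain) {
        // Encode with UTF-8 instead of the platform default charset.
        return run(Cipher.ENCRYPT_MODE, plain.getBytes(StandardCharsets.UTF_8));
    }

    static String decrypt(byte[] encrypted) {
        // Decode with the same charset that was used to encode.
        return new String(run(Cipher.DECRYPT_MODE, encrypted), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String before = "\u25bc ABC";
        String after = decrypt(encrypt(before));
        System.out.println("Before char(0): " + (int) before.charAt(0)); // 9660
        System.out.println("After char(0): " + (int) after.charAt(0));   // 9660
        System.out.println("Equal: " + before.equals(after));            // true
    }
}
```

The only change that matters for the '?' problem is that getBytes() and new String() both name UTF-8, so the result no longer depends on the machine's default charset.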

--
Mayeul

#7456

From: "Qu0ll" <Qu0llSixFour@gmail.com>
Date: 2011-08-29 17:18 +1000
Message-ID: <EoSdnWYBeIvtoMbTnZ2dnUVZ_t6dnZ2d@westnet.com.au>
In reply to: #7455

"Mayeul"  wrote in message news:4e5b3747$0$28393$426a74cc@news.free.fr... 

> String.getBytes() and String(byte[]) convert a String to a byte array
> and back. That conversion is the job of a character encoding, which
> Java calls a 'charset'. If you do not specify which charset you want to
> use, they use your default charset, which depends on your environment.
>
> This charset is not guaranteed to cover all of Unicode. In fact, in
> western environments it is rather likely to be ISO-8859-1 or similar,
> which can represent only a small subset of Unicode and replaces
> anything else, such as "\u25bc", with '?'.
>
> Which is why you're better off forcing the use of a charset that covers
> all of Unicode, like UTF-8. UTF-8 and the UTF-16 variants are guaranteed
> to be supported by every Java implementation, which makes them safe
> choices.

Thanks Mayeul, I now use UTF-8 and it works perfectly :-)


#7457

From: Peter Duniho <NpOeStPeAdM@NnOwSlPiAnMk.com>
Date: 2011-08-29 00:29 -0700
Message-ID: <rZKdndUngJVcosbTnZ2dnUVZ_jCdnZ2d@posted.palinacquisition>
In reply to: #7454

On 8/28/11 11:11 PM, Qu0ll wrote:
> This is my first go at using Java encryption. I have a requirement to
> encrypt and then later decrypt a series of strings that may contain
> special Unicode characters such as "\u25bc". The code below correctly
> encrypts and decrypts "normal" ASCII strings but turns characters like
> "\u25bc" into '?' when it decrypts (or maybe even when it encrypts).
>
> It doesn't really matter which encryption algorithm I use as long as it
> is reasonably secure (I chose AES) but the encryption/decryption process
> needs to handle these special characters.
>
> The output from the following code is:
>
> Before char(0): 9660
> After char(0): 63
> Equal: false
>
> How can I get this to work? [...]

In addition to the reply from Mayeul (which is on the mark), I would 
offer some more general debugging advice:

Your problem can be more easily deciphered if you take the important 
step of reducing the problem into its component parts.  You have two 
different transformations going on, and of course either of the 
transformations could be messing things up.

The correct first step is to test the encryption and the character
encoding/decoding steps separately.  You can test encryption by
encrypting and then decrypting a byte array of known values and
comparing the result to the original.  For general testing, a
pseudo-randomly generated array gives you a "better" test, and even a
predefined sequence of bytes is a good initial test…but for this
specific issue, the best test is simply to use the byte array you got
from the string.  The character encoding/decoding can of course be
tested by converting a string to bytes and then back again
(specifically, one of the strings you've identified as being
problematic).
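
One way to split the two checks might look like the sketch below. The class and method names are invented for illustration, the key is the demo key from the original post, and StandardCharsets assumes Java 7 or later. The first method never touches a string, and the second never touches a cipher.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

// Sketch only: hypothetical names, used to test each transformation alone.
final class SplitTheProblem {

    // Test 1: bytes in, bytes out. No strings or charsets involved.
    static boolean cipherRoundTrips(byte[] original) {
        try {
            SecretKeySpec key = new SecretKeySpec(
                    "0123456789ABCDEF".getBytes(StandardCharsets.US_ASCII), "AES");
            Cipher enc = Cipher.getInstance("AES");
            enc.init(Cipher.ENCRYPT_MODE, key);
            Cipher dec = Cipher.getInstance("AES");
            dec.init(Cipher.DECRYPT_MODE, key);
            return Arrays.equals(original, dec.doFinal(enc.doFinal(original)));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES operation failed", e);
        }
    }

    // Test 2: string -> bytes -> string. No encryption involved.
    static boolean charsetRoundTrips(String text, Charset charset) {
        return text.equals(new String(text.getBytes(charset), charset));
    }

    public static void main(String[] args) {
        String problem = "\u25bc ABC";
        byte[] bytes = problem.getBytes(StandardCharsets.UTF_8);

        System.out.println("cipher:     " + cipherRoundTrips(bytes));                                  // true
        System.out.println("ISO-8859-1: " + charsetRoundTrips(problem, StandardCharsets.ISO_8859_1));  // false
        System.out.println("UTF-8:      " + charsetRoundTrips(problem, StandardCharsets.UTF_8));       // true
        // Result depends on the machine, so no expected value is given here.
        System.out.println("default (" + Charset.defaultCharset() + "): "
                + charsetRoundTrips(problem, Charset.defaultCharset()));
    }
}
```

If the cipher test passes and the charset test fails for the default charset, the encryption can be dropped from the question entirely.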

Bottom line: your actual question is somewhat obfuscated by the 
inclusion of encryption in the question.  Encryption doesn't care about 
characters…it only deals with bytes, so it's practically certain that 
the encryption aspect is a complete red herring.  You can be more 
effective both at figuring problems out yourself, as well as at posting 
a true SSCCE and a focused question, if you get into the habit of really 
narrowing down a problem to its essential part.

Pete

#7462

From: Roedy Green <see_website@mindprod.com.invalid>
Date: 2011-08-29 04:29 -0700
Message-ID: <bhtm571dgtauenant0mtgo3f1qu4lmg2s5@4ax.com>
In reply to: #7454

On Mon, 29 Aug 2011 16:11:13 +1000, "Qu0ll" <Qu0llSixFour@gmail.com>
wrote, quoted or indirectly quoted someone who said :

>raw.getBytes()
I suspect someday this method will be deprecated.

 raw.getBytes( encoding ) 

is what you want since the receiver and sender might not have the same
default encoding.
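
To make the sender/receiver point concrete, here is a small sketch that imitates two machines with different defaults by passing the charsets in explicitly. The class name and the transfer() helper are hypothetical, and StandardCharsets assumes Java 7 or later.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Sketch only: hypothetical helper that imitates a sender and a receiver.
final class MismatchedDefaults {

    // Encodes with the sender's charset and decodes with the receiver's,
    // the way two machines relying on their own defaults would.
    static String transfer(String text, Charset senderCharset, Charset receiverCharset) {
        byte[] wire = text.getBytes(senderCharset);
        return new String(wire, receiverCharset);
    }

    public static void main(String[] args) {
        String text = "\u25bc ABC";

        String garbled = transfer(text, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(text.equals(garbled)); // false
        System.out.println(garbled.length());     // 7: each of the 3 UTF-8 bytes became its own char

        String intact = transfer(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(text.equals(intact));  // true
    }
}
```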

Encryption algorithms concern themselves with bytes.  So if you want to
deal with strings or chars you have to convert them to bytes, encrypt,
decrypt, then turn them back into strings/chars.

The art of converting String to bytes is called encoding, which has
nothing to do with encryption.

See http://mindprod.com/jgloss/encoding.html

You have to know something about the distribution of your characters
to choose an optimal encoding.  The brute force method is to convert
your string into a UTF-16 array of byte pairs.  However, for mostly
ASCII text that at least doubles the number of bytes to encrypt
compared with a more compact encoding such as UTF-8.
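
The size difference is easy to measure. The sketch below uses a made-up helper, encodedLength(), to count the bytes each charset produces for the string from the original post; StandardCharsets assumes Java 7 or later.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Sketch only: hypothetical helper for comparing encoded sizes.
final class EncodedSizes {

    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String text = "\u25bc ABC"; // five chars, one of them outside ASCII

        System.out.println(encodedLength(text, StandardCharsets.UTF_8));    // 7  (3 + 4 * 1)
        System.out.println(encodedLength(text, StandardCharsets.UTF_16BE)); // 10 (5 * 2)
        System.out.println(encodedLength(text, StandardCharsets.UTF_16));   // 12 (2-byte byte-order mark + 5 * 2)
    }
}
```

With a padded block cipher such as AES the ciphertext is rounded up to a whole number of 16-byte blocks, so for strings this short the difference disappears after encryption; it only shows up on longer text.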

You can also compress the string first to get a compact byte string.
This takes more computing time, but reduces the size of the encrypted
bytes.
See http://mindprod.com/jgloss/compression.html
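
A minimal sketch of the compress-first idea, assuming GZIP from java.util.zip as the compressor (the post does not name one) and UTF-8 as the encoding. The class and method names are hypothetical, and UncheckedIOException assumes Java 8 or later. The compressed bytes would then be handed to the cipher, and the decrypted bytes to decompress().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Sketch only: hypothetical names; GZIP is an assumed choice of compressor.
final class CompressBeforeEncrypt {

    // String -> UTF-8 bytes -> GZIP bytes, ready to be encrypted.
    static byte[] compress(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the GZIP trailer has been written.
        return buffer.toByteArray();
    }

    // GZIP bytes (after decryption) -> UTF-8 bytes -> String.
    static String decompress(byte[] compressed) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "\u25bc ABC";
        System.out.println(text.equals(decompress(compress(text)))); // true
    }
}
```

For very short strings the GZIP header and trailer can make the result larger than the input, so this is only likely to pay off on longer or repetitive text.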
-- 
Roedy Green Canadian Mind Products
http://mindprod.com
The modern conservative is engaged in one of man's oldest exercises in moral philosophy; that is, 
the search for a superior moral justification for selfishness.
~ John Kenneth Galbraith (born: 1908-10-15 died: 2006-04-29 at age: 97) 
