| Started by | "Qu0ll" <Qu0llSixFour@gmail.com> |
|---|---|
| First post | 2011-08-29 16:11 +1000 |
| Last post | 2011-08-29 04:29 -0700 |
| Articles | 5 — 4 participants |
Using encryption with special Unicode characters "Qu0ll" <Qu0llSixFour@gmail.com> - 2011-08-29 16:11 +1000
Re: Using encryption with special Unicode characters Mayeul <mayeul.marguet@free.fr> - 2011-08-29 08:56 +0200
Re: Using encryption with special Unicode characters "Qu0ll" <Qu0llSixFour@gmail.com> - 2011-08-29 17:18 +1000
Re: Using encryption with special Unicode characters Peter Duniho <NpOeStPeAdM@NnOwSlPiAnMk.com> - 2011-08-29 00:29 -0700
Re: Using encryption with special Unicode characters Roedy Green <see_website@mindprod.com.invalid> - 2011-08-29 04:29 -0700
| From | "Qu0ll" <Qu0llSixFour@gmail.com> |
|---|---|
| Date | 2011-08-29 16:11 +1000 |
| Subject | Using encryption with special Unicode characters |
| Message-ID | <OvWdnUIcqeRBsMbTnZ2dnUVZ_gCdnZ2d@westnet.com.au> |
This is my first go at using Java encryption. I have a requirement to
encrypt and then later decrypt a series of strings that may contain special
Unicode characters such as "\u25bc". The code below correctly encrypts and
decrypts "normal" ASCII strings but turns characters like "\u25bc" into '?'
when it decrypts (or maybe even when it encrypts).
It doesn't really matter which encryption algorithm I use as long as it is
reasonably secure (I chose AES) but the encryption/decryption process needs
to handle these special characters.
The output from the following code is:
Before char(0): 9660
After char(0): 63
Equal: false
How can I get this to work? Here is the code:
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

public class Encryption {

    private static final String ALGORITHM = "AES";
    private static final String KEY = "0123456789ABCDEF";
    private static final SecretKeySpec KEY_SPEC =
        new SecretKeySpec(KEY.getBytes(), ALGORITHM);
    private static Cipher cipherEncrypt;
    private static Cipher cipherDecrypt;

    static {
        try {
            cipherEncrypt = Cipher.getInstance(ALGORITHM);
            cipherEncrypt.init(Cipher.ENCRYPT_MODE, KEY_SPEC);
            cipherDecrypt = Cipher.getInstance(ALGORITHM);
            cipherDecrypt.init(Cipher.DECRYPT_MODE, KEY_SPEC);
        } catch (final Exception e) {
            e.printStackTrace();
        }
    }

    public static String decrypt(final byte[] raw) {
        String result = null;
        try {
            result = new String(cipherDecrypt.doFinal(raw));
        } catch (final Exception e) {
            e.printStackTrace();
        }
        return result;
    }

    public static byte[] encrypt(final String raw) {
        byte[] result = null;
        try {
            result = cipherEncrypt.doFinal(raw.getBytes());
        } catch (final Exception e) {
            e.printStackTrace();
        }
        return result;
    }

    public static void main(final String[] args) {
        final String before = "\u25bc ABC";
        System.out.println("Before char(0): " + (int)before.charAt(0));
        final String after = decrypt(encrypt(before));
        System.out.println("After char(0): " + (int)after.charAt(0));
        System.out.println("Equal: " + before.equals(after));
    }
}
--
And loving it,
-Qu0ll (Rare, not extinct)
_________________________________________________
Qu0llSixFour@gmail.com
[Replace the "SixFour" with numbers to email me]
| From | Mayeul <mayeul.marguet@free.fr> |
|---|---|
| Date | 2011-08-29 08:56 +0200 |
| Message-ID | <4e5b3747$0$28393$426a74cc@news.free.fr> |
| In reply to | #7454 |
On 29/08/2011 08:11, Qu0ll wrote:
> This is my first go at using Java encryption. I have a requirement to
> encrypt and then later decrypt a series of strings that may contain
> special Unicode characters such as "\u25bc". The code below correctly
> encrypts and decrypts "normal" ASCII strings but turns characters like
> "\u25bc" into '?' when it decrypts (or maybe even when it encrypts).
>
> It doesn't really matter which encryption algorithm I use as long as it
> is reasonably secure (I chose AES) but the encryption/decryption process
> needs to handle these special characters.
>
> The output from the following code is:
>
> Before char(0): 9660
> After char(0): 63
> Equal: false
>
> How can I get this to work? Here is the code:
> [...]
String.getBytes() and new String(byte[]) convert between a String and a
byte array. That conversion is the job of a character encoding, which
Java calls a 'charset'. If you do not specify which charset you want to
use, they will use your default charset, which depends on your environment.
This charset is not guaranteed to support Unicode. In fact, in western
environments it is rather likely to be ISO-8859-1 or similar, which
cannot represent most Unicode characters.
Which is why you're better off forcing the use of a Unicode-compliant
charset, like UTF-8. UTF-8 and the UTF-16 variants are guaranteed to be
supported by Java, which makes them safe choices.
--
Mayeul
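The effect Mayeul describes can be reproduced without any encryption. The following is a minimal sketch, not code from the thread: the class name `CharsetRoundTrip` and its helper methods are hypothetical, and it assumes Java 7 or later for `StandardCharsets` (on older versions, `getBytes("UTF-8")` with the charset name would be the equivalent).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative, not from the thread.
class CharsetRoundTrip {

    // Encode to ISO-8859-1 bytes and decode again. Characters the charset
    // cannot represent are replaced with '?' during encoding.
    static String viaLatin1(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Encode to UTF-8 bytes and decode again. UTF-8 can represent every
    // Unicode code point, so the string survives unchanged.
    static String viaUtf8(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String before = "\u25bc ABC";
        // ISO-8859-1 loses the triangle: char(0) becomes 63, i.e. '?'
        System.out.println("ISO-8859-1 char(0): " + (int) viaLatin1(before).charAt(0));
        // UTF-8 keeps it: char(0) stays 9660
        System.out.println("UTF-8 char(0): " + (int) viaUtf8(before).charAt(0));
    }
}
```

The 63 printed for ISO-8859-1 is the same value the original post reports, which suggests the poster's default charset behaved similarly.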
| From | "Qu0ll" <Qu0llSixFour@gmail.com> |
|---|---|
| Date | 2011-08-29 17:18 +1000 |
| Message-ID | <EoSdnWYBeIvtoMbTnZ2dnUVZ_t6dnZ2d@westnet.com.au> |
| In reply to | #7455 |
"Mayeul" wrote in message news:4e5b3747$0$28393$426a74cc@news.free.fr... > String.getBytes() and String(byte[]), converting String to byte array > and backwise, is the job of a character encoding, which, in Java, are > called 'charsets'. If you do not specify which charset you want to use, > they will use your default charset, which depends on your environment. > > This charset is not guaranteed to support Unicode. In fact, in western > environments it is rather likely to be iso-8859-1 or likewise, which > does not support Unicode. > > Which is why you're better off forcing the use of a Unicode-compliant > charset, like utf-8. utf-8 and the utf-16s are guaranteed to be > supported by Java, which makes them safe choices. Thanks Mayeul, I now use UTF-8 and it works perfectly :-) -- And loving it, -Qu0ll (Rare, not extinct) _________________________________________________ Qu0llSixFour@gmail.com [Replace the "SixFour" with numbers to email me]
| From | Peter Duniho <NpOeStPeAdM@NnOwSlPiAnMk.com> |
|---|---|
| Date | 2011-08-29 00:29 -0700 |
| Message-ID | <rZKdndUngJVcosbTnZ2dnUVZ_jCdnZ2d@posted.palinacquisition> |
| In reply to | #7454 |
On 8/28/11 11:11 PM, Qu0ll wrote:
> This is my first go at using Java encryption. I have a requirement to
> encrypt and then later decrypt a series of strings that may contain
> special Unicode characters such as "\u25bc". The code below correctly
> encrypts and decrypts "normal" ASCII strings but turns characters like
> "\u25bc" into '?' when it decrypts (or maybe even when it encrypts).
>
> It doesn't really matter which encryption algorithm I use as long as it
> is reasonably secure (I chose AES) but the encryption/decryption process
> needs to handle these special characters.
>
> The output from the following code is:
>
> Before char(0): 9660
> After char(0): 63
> Equal: false
>
> How can I get this to work? [...]

In addition to the reply from Mayeul (which is on the mark), I would offer
some more general debugging advice:

Your problem can be more easily deciphered if you take the important step
of reducing the problem into its component parts. You have two different
transformations going on, and of course either of the transformations
could be messing things up.

The correct first step is to test both the encryption and the character
encoding/decoding steps separately. You can test encryption by encrypting
a byte array of known values (for general testing, generated
pseudo-randomly gives you a "better" test, while even some predefined
sequence of bytes would be a good initial test…but for this specific
issue, the best test is simply to compare the original byte array you got
from the string to the one you get after encrypting and then decrypting
that original byte array).

And the character encoding/decoding of course can be tested by converting
a string to bytes and then back again (specifically, one of the strings
you've identified as being problematic).

Bottom line: your actual question is somewhat obfuscated by the inclusion
of encryption in the question. Encryption doesn't care about characters…it
only deals with bytes, so it's practically certain that the encryption
aspect is a complete red herring. You can be more effective both at
figuring problems out yourself, as well as at posting a true SSCCE and a
focused question, if you get into the habit of really narrowing down a
problem to its essential part.

Pete
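The two separate checks Pete describes could be sketched roughly as follows. This is an illustration under assumptions, not code from the thread: the class `IsolateTheProblem` and both method names are hypothetical, and the key is the demo key from the original post.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical test harness that exercises each transformation on its own.
class IsolateTheProblem {

    private static final SecretKeySpec KEY_SPEC = new SecretKeySpec(
            "0123456789ABCDEF".getBytes(StandardCharsets.US_ASCII), "AES");

    // Check 1: bytes -> encrypt -> decrypt -> bytes. No strings are involved.
    static boolean cipherRoundTrips(byte[] input) {
        try {
            Cipher enc = Cipher.getInstance("AES");
            enc.init(Cipher.ENCRYPT_MODE, KEY_SPEC);
            Cipher dec = Cipher.getInstance("AES");
            dec.init(Cipher.DECRYPT_MODE, KEY_SPEC);
            return Arrays.equals(input, dec.doFinal(enc.doFinal(input)));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES round trip failed", e);
        }
    }

    // Check 2: String -> bytes -> String. No encryption is involved.
    static boolean charsetRoundTrips(String s, Charset charset) {
        return s.equals(new String(s.getBytes(charset), charset));
    }

    public static void main(String[] args) {
        String problematic = "\u25bc ABC";
        Charset platform = Charset.defaultCharset();
        System.out.println("Cipher round trip ok: "
                + cipherRoundTrips(problematic.getBytes(platform)));
        System.out.println(platform + " round trip ok: "
                + charsetRoundTrips(problematic, platform));
        System.out.println("UTF-8 round trip ok: "
                + charsetRoundTrips(problematic, StandardCharsets.UTF_8));
    }
}
```

The cipher check should pass whatever bytes it is given, while the result of the default-charset check depends on the platform. A failure in the second check alone would point at the encoding step rather than at AES.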
| From | Roedy Green <see_website@mindprod.com.invalid> |
|---|---|
| Date | 2011-08-29 04:29 -0700 |
| Message-ID | <bhtm571dgtauenant0mtgo3f1qu4lmg2s5@4ax.com> |
| In reply to | #7454 |
On Mon, 29 Aug 2011 16:11:13 +1000, "Qu0ll" <Qu0llSixFour@gmail.com>
wrote, quoted or indirectly quoted someone who said :
>raw.getBytes()

I suspect someday this method will be deprecated. raw.getBytes( encoding )
is what you want since the receiver and sender might not have the same
default encoding.

Encryption algorithms concern themselves with bytes. So if you want to
deal with strings or chars you have to convert them to bytes, encrypt,
decrypt, then turn them back to strings/chars. The art of converting
String to bytes is called encoding, which has nothing to do with
encryption.
See http://mindprod.com/jgloss/encoding.html

You have to know something about the distribution of your characters to
choose an optimal encoding. The brute force method is to convert your
string into a UTF-16 array of byte pairs. However that more than doubles
the size of the encrypted bytes over optimal.

You can also compress the string first to get a compact byte string. This
takes more computing time, but reduces the size of the encrypted bytes.
see http://mindprod.com/jgloss/compression.html
--
Roedy Green Canadian Mind Products
http://mindprod.com
The modern conservative is engaged in one of man's oldest exercises in
moral philosophy; that is, the search for a superior moral justification
for selfishness.
~ John Kenneth Galbraith (born: 1908-10-15 died: 2006-04-29 at age: 97)
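Roedy's point about encoding choice and size can be checked by counting the bytes each charset produces for the string from the original post. This is a small sketch with a hypothetical class name `EncodedSizes`. It assumes Java 7 or later for `StandardCharsets`, and it measures the bytes before encryption (a block cipher such as AES may pad the result further).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper that compares encoded sizes before encryption.
class EncodedSizes {

    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes when encoded as UTF-16. Java's "UTF-16" encoder
    // writes a 2-byte byte-order mark in front of the data.
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16).length;
    }

    public static void main(String[] args) {
        String sample = "\u25bc ABC";
        // U+25BC takes 3 bytes in UTF-8, the four ASCII chars 1 byte each: 7
        System.out.println("UTF-8 bytes: " + utf8Length(sample));
        // 5 chars at 2 bytes each plus the byte-order mark: 12
        System.out.println("UTF-16 bytes: " + utf16Length(sample));
    }
}
```

For mostly ASCII text like this sample, UTF-8 is clearly smaller. For text dominated by characters that need three bytes in UTF-8, UTF-16 can come out ahead, which is why the character distribution matters.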