Buy Sell Discuss UK Domain Names at AcornDomains.co.uk

All times are GMT.

01-02-2009, 04:57:05 AM     #1
woopwoop

UTF-8 encoding question

Really strange problem that I'm having:

I have different pages, some with Japanese and some with French characters (never mixed - either one or the other).

Some parts of the pages display the characters correctly; in other parts, some of the characters are replaced with ??? or with question marks in diamonds (�).

I have this in the head (which is actually a php included file):
Quote:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
Which I thought was enough.

I think that because it's an included file, there's an issue with the server default encoding taking over and messing up half of the characters...

I'm just not sure whether it is a server encoding issue, or why only half of the characters are affected.

Has anyone experienced anything like this?

01-02-2009, 05:54:45 AM     #2
Skinner

Are you sure you haven't inserted incorrectly encoded data into the database, or have the database encoding set differently from the output encoding?

One common issue I see a lot is people who have written their content in Microsoft Word, and strange characters magically appear in place of what Microshaft call smart quotes.

01-02-2009, 06:35:54 AM     #3
woopwoop

The text isn't from a database.
In some places it's an echoed variable; in other places it's straight text.

The problem seems to happen when the page in question gets its header from an included file:

Quote:
include "header.php";
Even though the included header has:

Quote:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
I think that my charset is being ignored and that the server is using its own - really strange - I can't override it with the things I've tried in my .htaccess, or with the php.ini that I created in the directory (I can't change the real php.ini on the server as it's a shared hosting plan).

Really odd. I'm waiting for a reply from support

01-02-2009, 06:49:23 AM     #4
woopwoop

Yep, this is exactly what's happening (what I mentioned above) - really odd.

Quote:
主題を拾い読みしなさい
^ this is some Japanese.

If I put it in the header.php file and then load the page which includes the header, it shows perfectly.

If I put it in index.php (which includes header.php, where the encoding information lives), then the Japanese displays as ?????

So I need to either get the server default to be UTF-8 or load my headers sooner (before the file include)

01-02-2009, 07:54:02 AM     #5
Edwin

Perhaps you're not saving the main file in UTF-8 format from the text editor, or whatever you used to make it? If you save it in ASCII or another format, it could turn out like you posted.
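
One way to test this suggestion is to check the bytes of the file itself. The following is a minimal sketch, assuming PHP 7 or later; the helper name `is_valid_utf8` is made up for illustration and is not from the thread.

```php
<?php
// Hypothetical helper (not from the thread): reports whether a string,
// e.g. the contents of index.php, is valid UTF-8.
function is_valid_utf8(string $bytes): bool
{
    // preg_match() with the /u modifier returns false when the subject
    // is not valid UTF-8, and 1 when the empty pattern matches.
    return preg_match('//u', $bytes) === 1;
}

$utf8   = "caf\xC3\xA9";   // "café" encoded as UTF-8
$latin1 = "caf\xE9";       // "café" encoded as ISO-8859-1

var_dump(is_valid_utf8($utf8));    // bool(true)
var_dump(is_valid_utf8($latin1));  // bool(false)
```

To check a real file, something like `is_valid_utf8(file_get_contents('index.php'))` should work. Note that a pure ASCII file also passes, since ASCII is a subset of UTF-8.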

01-02-2009, 08:16:14 AM     #6
woopwoop

Thanks for your help Edwin and Skinner - appreciate it.

Edwin, I don't think that the issue is how I'm saving it, because if I use the same file editor and save in the same way I can get a success or a failure depending on whether the Japanese goes in the header (where the encoding is specified) or in the index (where the encoding info is PHP-included via the header).

So I'm sure the included file's encoding instruction is ignored if the Japanese (or French, for that matter) is located in the index file and not in the header.



I made some progress with my host. They gave me a php.ini file on my server. I see the following in the file:

Quote:
; As of 4.0b4, PHP always outputs a character encoding by default in
; the Content-type: header. To disable sending of the charset, simply
; set it to be empty.
;
; PHP's built-in default is text/html
default_mimetype = "text/html"
;default_charset = "iso-8859-1"
So default_charset is commented out by the leading ";".
If I remove the ";", my French and German pages output correctly, but the Japanese goes even crazier.

I tried setting default_charset = "UTF-8",
but then the French, German and Japanese are all displayed with errors.

There may be more things I need to change in php.ini, but this issue seems to be a bigger problem in general. I'm not sure whether it's limited to my host's setup, or whether it's a PHP problem where the encoding info isn't transferred when the page header is part of a PHP include.

01-02-2009, 03:41:57 PM     #7
Skinner

Have you tried setting the actual HTTP header that Apache sends for the page, rather than the meta tag in the include?

header('Content-type: text/html; charset=utf-8');
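
When the server sends a charset in the HTTP Content-Type header, browsers give it precedence over the meta tag, which would explain the meta tag appearing to be ignored. A rough sketch of how the top of index.php could look, assuming nothing has been output yet; the helper name `content_type_header` is hypothetical:

```php
<?php
// index.php (sketch): send the charset in the real HTTP header
// before any output, then pull in the shared header file.

// Hypothetical helper: builds the header line so it can be reused.
function content_type_header(string $charset = 'UTF-8'): string
{
    return 'Content-Type: text/html; charset=' . $charset;
}

// header() only works before the first byte of output has been sent.
if (!headers_sent()) {
    header(content_type_header());
}

// include "header.php";   // the include from the thread would follow here
```

Even a blank line or a byte order mark before the opening `<?php` tag counts as output, so the call has to come first in the file.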

03-02-2009, 01:40:24 AM     #8
woopwoop

Skinner & Edwin - thanks for your help - I've now kinda fixed the problem (I just have other issues now).

Edwin, you were right about how the file was saved. I was assuming that a language pack that came with some software would have the French/German/Japanese files saved as UTF-8 - but they weren't.

Also, the php.ini settings on my server had to be tweaked.
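
For files like these, one option is to convert them to UTF-8 once and save the result. This is only a sketch: it assumes the iconv extension is available and that the original encoding is known. The thread does not say what the language pack used; ISO-8859-1 below is a guess for the French/German files, and the helper name `to_utf8` is made up.

```php
<?php
// Hypothetical helper: converts text from a known legacy encoding to UTF-8.
// The source encoding must be known; it is an assumption here, not
// something the thread states.
function to_utf8(string $bytes, string $fromEncoding): string
{
    $converted = iconv($fromEncoding, 'UTF-8', $bytes);
    if ($converted === false) {
        throw new RuntimeException("Could not convert from $fromEncoding");
    }
    return $converted;
}

$latin1 = "fran\xE7ais";                 // "français" in ISO-8859-1
$utf8   = to_utf8($latin1, 'ISO-8859-1');

var_dump($utf8 === "fran\xC3\xA7ais");   // bool(true)
```

The Japanese files would need their own source encoding (Shift_JIS and EUC-JP are common candidates), which has to be confirmed rather than guessed.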


Was wondering if you knew about preg_replace and safely displaying Japanese characters on a page.

Quote:
$translated = preg_replace("/[^a-z âàéèëÉÂÀËçÇ \d]/i", " ", $input);

I'm using code like this to allow only a-z, digits, French accented characters and spaces; any other character is replaced with a space.

1. Is this a safe way to go about this? Or can quotes and other characters still get through?

2. What do I need to put in there to allow Japanese characters (Kanji I think)? And will it be safe, or will it open up the risk of quotes and other code getting through?
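
One possible approach to both questions, sketched under the assumption that $input is valid UTF-8: add the /u modifier so the pattern works on characters rather than bytes, and use Unicode property classes instead of listing every accented letter. The function name is hypothetical.

```php
<?php
// Hypothetical helper: replaces every character that is not a letter,
// a digit, or a space with a single space. Assumes $input is valid UTF-8.
function keep_letters_digits_spaces(string $input): string
{
    // \p{L} = any Unicode letter (Latin, accented Latin, kana, kanji, ...)
    // \p{N} = any Unicode number
    // /u    = treat pattern and subject as UTF-8
    $result = preg_replace('/[^\p{L}\p{N} ]/u', ' ', $input);
    // preg_replace() returns null on error, e.g. invalid UTF-8 input.
    return $result ?? '';
}

echo keep_letters_digits_spaces("l'été 主題"), "\n";  // prints: l été 主題
```

Quotes and angle brackets are neither letters nor numbers, so they get replaced by spaces. For a narrower whitelist, PCRE also has script classes such as `\p{Han}`, `\p{Hiragana}` and `\p{Katakana}`. Filtering like this does not replace escaping the value when it is printed into HTML.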


Really appreciate your help. Skinner you always try and answer my coding questions and Edwin you spurred me to double checking the language file encoding on this - thanks again.

03-02-2009, 02:12:58 AM     #9
Skinner

You would be better using graphics made with GD or ImageMagick to display the Kanji.

You can make quotes etc. safe using the htmlentities() function, which converts < and > into &lt; and &gt;, and " into &quot;, stuff like that. It stops nasties being added in.
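
A small sketch of output escaping for a UTF-8 page. It uses htmlspecialchars(), a close relative of htmlentities() that escapes only the HTML special characters, with the charset passed explicitly. The wrapper name `h` is made up for illustration.

```php
<?php
// Hypothetical helper: escapes a UTF-8 string for output inside HTML.
function h(string $text): string
{
    // ENT_QUOTES escapes both double and single quotes;
    // 'UTF-8' tells PHP how the input is encoded.
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

echo h('<a href="x">主題</a>'), "\n";
// prints: &lt;a href=&quot;x&quot;&gt;主題&lt;/a&gt;
```

The Japanese passes through untouched while the markup characters are neutralised, so there is no need to strip kanji for safety.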

There are also utf8_encode() and utf8_decode(), which convert between ISO-8859-1 and UTF-8, and may help too.

My RegEx sucks major ass, but I'm pretty sure you need the i modifier after the closing delimiter (your pattern already ends in /i) to make it case insensitive; otherwise you're only going to match lowercase letters.

03-02-2009, 02:15:30 AM     #10
Skinner

I saw this character list on another website of what you need to allow:

ÀàÁáÂâÃãÄäÅåÆæÇçÈèÉéÊêËëÌìÍíÎîÏïÐðÑñÒòÓóÔôÕõÖöØøÙùÚúÛûÜüÝýÞþœŒ