Partitions created by the installer/diskdrake are not aligned properly on 4K drives, which causes very bad performance (https://ata.wiki.kernel.org/index.php/ATA_4_KiB_sector_issues). This issue has been fixed in parted/fdisk, but not in diskdrake, as it does all the partitioning itself. This is quite unfortunate, as other major OSes and distros have handled this correctly for quite some time now.

The "quick fix" that makes most cases work, and should be done in any case, is to align partitions to 1MB boundaries by default.

Additional lower-priority things to take into account are:
- drives configured with a jumper that require off-by-one alignment of the partitions, and
- drives with logical non-512 sector sizes (mostly USB sticks as of now), where diskdrake currently creates partitions that are too large because it assumes a sector size of 512.

Related Mandriva bug reports:
https://qa.mandriva.com/show_bug.cgi?id=58071
https://qa.mandriva.com/show_bug.cgi?id=46774
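For readers who want to see the arithmetic behind the 1MB quick fix, here is a minimal sketch in Python. It is not the diskdrake (Perl) code from the attached patch; the names `align_up` and `sectors_for` are hypothetical and only illustrate rounding a start sector up to a 1 MiB boundary and why a hard-coded 512-byte sector size yields oversized partitions.

```python
MIB = 1024 * 1024


def align_up(start_sector: int, sector_size: int = 512, alignment: int = MIB) -> int:
    """Round start_sector up to the next sector that lies on an `alignment`-byte boundary."""
    if alignment % sector_size != 0:
        raise ValueError("alignment must be a multiple of the sector size")
    step = alignment // sector_size          # sectors per alignment unit
    return -(-start_sector // step) * step   # ceiling division, then scale back


def sectors_for(size_bytes: int, sector_size: int) -> int:
    """Number of logical sectors needed to hold size_bytes."""
    return -(-size_bytes // sector_size)


if __name__ == "__main__":
    # Classic CHS-style start at sector 63 moves to sector 2048 (1 MiB) with 512-byte sectors.
    print(align_up(63))            # 2048
    # With 4096-byte logical sectors, 1 MiB is only 256 sectors.
    print(align_up(63, 4096))      # 256
    # A 100 MiB partition needs 204800 sectors of 512 bytes but only 25600 of 4096 bytes;
    # using the 512-byte count on a 4096-byte device would make it 8 times too large.
    print(sectors_for(100 * MIB, 512), sectors_for(100 * MIB, 4096))
```

An already aligned start sector is left unchanged, so applying the rounding twice is harmless.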
Created attachment 594
Quick fix to align partitions to start on 1MB boundaries

Here's a small patch that changes diskdrake to align partitions to 1MB boundaries instead of involving head/cylinder boundaries. This is what other OSes (Linux/Windows) do, and it fixes most cases where we don't align partitions properly. Partition endings could probably also be optimized, but that is lower priority because it doesn't affect performance the way partition start positions do. Pascal, could you maybe take a look at this one?
Created attachment 595
WIP patch for non-512-byte logical sector sizes

A full solution to all of the issues first requires us to handle non-512-byte logical sector sizes (some such USB drives apparently already exist, according to the Mandriva bug reports). Unfortunately it is not as easy as it seems; the 512-byte sector assumptions exist everywhere drakx handles partitions, and that is a lot of code. I began adding support for those, but I haven't finished it, and I don't think I'm going to continue it anytime soon, so attached is a WIP patch in case someone who wants to do the work finds it useful.
Keywords: (none) => PATCH
CC: (none) => thierry.vignaud
Assignee: bugsquad => pterjan
For the first patch: why don't you do the MB alignment in adjustStart()? It would be cleaner and more readable.
Because raw::adjustStart() is overridden by gpt.pm, mac.pm, sun.pm.
Then why not just create a helper rather than inlining it in adjustStartAndEnd()?
Anssi, can you just commit your first patch? That way we'll get basic testing in cauldron.
*** Bug 2037 has been marked as a duplicate of this bug. ***
CC: (none) => magnus.mud
OK, I can make it a separate function then. I'll probably apply it in SVN tomorrow when I'm less tired. Sorry for the delay.
CC: (none) => maarten.vanraes
Blocks: (none) => 1994
Done, r1844.
About the second patch, it would be better to rename functions whose behavior changes (such as MB()) in order to catch every caller. But it might just be better to revive the pixel/pterjan work on leveraging libparted (the main issue was parted behaving like fdisk on partition tables it doesn't parse well, showing a zeroed partition table).
Pinging, because nothing has happened to this report for more than 3 months, and it still has the status NEW or REOPENED.

@ Pascal
Please set status to ASSIGNED if you think this bug was assigned correctly. If for workflow reasons you can't do that, then please put OK on the whiteboard instead.
CC: (none) => marja11
Hello, this one is resolved, no?
Ping ?
(In reply to comment #11)
> But it might just be better to revive pixel/pterjan work for leveraging
> libparted (main issue was parted behaving like fdisk on partition tables it
> doesn't parse well, showing zeroed part table.

Not only that.

From what I remember, error handling in libparted is awful (you need to set an exception handler, which gets called for any kind of error with a translated error string, and then parse that string). The default handler displays the error on the console.
Hi, This bug was filed against cauldron, but we do not have cauldron at the moment. Please report whether this bug is still valid for Mageia 2. Thanks :) Cheers, marja
Keywords: (none) => NEEDINFO
(In reply to comment #15)
> (In reply to comment #11)
> > But it might just be better to revive pixel/pterjan work for leveraging
> > libparted (main issue was parted behaving like fdisk on partition tables it
> > doesn't parse well, showing zeroed part table.
>
> Not only
>
> From what I remember, error handling in libparted is awful (you need to set an
> exception handler which will get called for any kind of error with a translated
> error string, and parse that string). Default handler displaying the error on
> the console.

That was on May 17th, so there's no chance this got fixed.
Keywords: NEEDINFO => (none)
Whiteboard: (none) => (Mga2)
Please look at the bottom of this mail to see whether you're the assignee of this bug, if you don't already know whether you are.

If you're the assignee:
We'd like to know for sure whether this bug was assigned correctly. Please change status to ASSIGNED if it is, or put OK on the whiteboard instead.
If you don't have a clue and don't see a way to find out, then please put NEEDHELP on the whiteboard.
Please assign back to Bug Squad or to the correct person to solve this bug if we were wrong to assign it to you, and explain why.
Thanks :)

****************************

@ the reporter and persons in the cc of this bug:
If you have any new information that wasn't given before (like this bug being valid for another version of Mageia, too, or it being solved) please tell us.

@ the reporter of this bug
If you didn't reply yet to a request for more information, please do so within two weeks from now.

Thanks all :-D
I just got a new SSD, and it is my understanding that the 25nm process chips, like those in the Vertex3 120GB line, use a 2MB erase block size (rather than 1MB, as the quick fix assumes). So, for me, I need alignment on 2MB boundaries. LVM already uses 4MB extents by default, so that's not an issue, but the filesystems themselves also need to be tuned appropriately. For ext4 (other modern filesystems work similarly), I need to use something like

    mkfs.ext4 -E stride=512,stripe-width=512 /dev/sdf1

but RAID users need similar inputs to avoid hitting unnecessary disks. I don't see how this could all be easily calculated for the RAID case, so it might make more sense to allow setting these via an advanced tab or something.
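The stride=512 figure in the comment above follows from dividing the assumed 2 MiB erase block by a 4 KiB filesystem block. A minimal sketch of that calculation, extended to the RAID case, might look like the following. The helper name `ext4_layout_hints` is hypothetical, the 4096-byte block size is an assumption (check the actual filesystem), and the RAID figures are made-up illustration values, not a recommendation.

```python
KIB = 1024
MIB = 1024 * KIB


def ext4_layout_hints(chunk_bytes: int, data_disks: int = 1, fs_block_bytes: int = 4096):
    """Return (stride, stripe_width) in filesystem blocks.

    stride       = blocks per chunk (here: per erase block or RAID chunk)
    stripe_width = stride * number of data-bearing disks
    """
    if chunk_bytes % fs_block_bytes != 0:
        raise ValueError("chunk size must be a multiple of the filesystem block size")
    stride = chunk_bytes // fs_block_bytes
    return stride, stride * data_disks


if __name__ == "__main__":
    # Single SSD with an assumed 2 MiB erase block and 4 KiB filesystem blocks.
    print(ext4_layout_hints(2 * MIB))                   # (512, 512)
    # Hypothetical 4-disk RAID5 (3 data disks) with 512 KiB chunks.
    print(ext4_layout_hints(512 * KIB, data_disks=3))   # (128, 384)
```

The resulting pair would then be passed to mkfs.ext4 as a comma-separated -E option list.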
CC: (none) => rick
Documentation team should write about this issue in the documentation about diskdrake (including in installer) if it didn't/doesn't get fixed. Is there any news?
CC: (none) => doc-bugs
Target Milestone: --- => Mageia 3
@ tmb
If this hasn't been fixed yet, could gparted then be included on both LiveDVDs? (I didn't see it on the 3alpha3 ones.)
Docteam has started advising users who want to install on an SSD to use a tool like gparted for the partitioning part, and not all users will like a CLI tool for that.
http://docteam.mageia.nl/installer/content/doPartitionDisks.html
BTW, I think this might break Windows booting after resizing (two people reported to me that Windows failed to boot after resizing it for installing Mageia).
(In reply to comment #22)
> BTW, I think this might break windows booting after resizing (two people
> reported to me windows failed to boot after resizing it for installing Mageia)

Increasing severity.

Should priority be increased, too?
Whiteboard: (Mga2) => 3alpha3 (Mga2)
Whiteboard: 3alpha3 (Mga2) => 3beta1 (Mga2)
Thanks to https://forums.mageia.org/en/viewtopic.php?f=15&t=4097&p=29407#p29407, I learned I misunderstood this bug.

I had always thought that Anssi committed fixes, but that nothing was submitted because of some error handling issue. Sorry about that.

Tbh, I don't understand what is left of this bug (some error handling issue - comment 15 - ...what impact does that have?)

And I don't understand tv's last comment either.

(In reply to comment #22)
> BTW, I think this might break windows booting after resizing (two people
> reported to me windows failed to boot after resizing it for installing Mageia)

What is "this"?

I don't have a clear view of the current situation. Is there anything the documentation team should say in the installer diskdrake help or the MCC diskdrake help? If so, what?
hmm, i was under the impression it was fixed...
Here is what Mageia 3 beta 2 did on my last install to SSD:

# fdisk -l -u /dev/sda

Disk /dev/sda: 128.0 GB, 128035676160 bytes, 250069680 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0007007a

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048    39166469    19582211   83  Linux
/dev/sda2        39169998   250067789   105448896    5  Extended
/dev/sda5        39170048    48387779     4608866   83  Linux
/dev/sda6        48390144    56564864     4087360+  82  Linux swap / Solaris
/dev/sda7        56567808   250067789    96749991   83  Linux

All start sectors are divisible by 8 and so they are correctly aligned to 2M boundaries.
The exception is the extended partition, but I think that does not matter.
Correct?
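The alignment of each partition in the listing above can be checked mechanically rather than by eye. The following Python sketch uses the start sectors and the 512-byte logical sector size from that fdisk output; the function names `alignment_bytes` and `is_aligned` are hypothetical.

```python
SECTOR_SIZE = 512          # logical sector size reported by fdisk above
MIB = 1024 * 1024

# Start sectors copied from the fdisk listing in this comment.
START_SECTORS = {
    "sda1": 2048,
    "sda2": 39169998,      # the extended partition
    "sda5": 39170048,
    "sda6": 48390144,
    "sda7": 56567808,
}


def alignment_bytes(start_sector: int, sector_size: int = SECTOR_SIZE) -> int:
    """Largest power-of-two byte boundary that the partition start falls on."""
    offset = start_sector * sector_size
    return offset & -offset            # isolates the lowest set bit


def is_aligned(start_sector: int, boundary: int, sector_size: int = SECTOR_SIZE) -> bool:
    """True if the partition start is a multiple of `boundary` bytes."""
    return (start_sector * sector_size) % boundary == 0


if __name__ == "__main__":
    for name, start in START_SECTORS.items():
        print(name, alignment_bytes(start),
              "1MiB" if is_aligned(start, MIB) else "-",
              "2MiB" if is_aligned(start, 2 * MIB) else "-")
```

With these numbers, every partition except the extended one starts on a 1 MiB boundary, but only sda5 and sda6 also happen to fall on a 2 MiB boundary.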
CC: (none) => derekjenn
Thanks, Derek :) So the bug is fixed in cauldron (unless that exception for the extended partition /does/ matter), only maybe not for windows resizing. If it is reported again, it would be good to have the output of # fdisk -l -u /dev/sda for two partitioned SSD disks: one where the resizing was done from within windows, and the other where it was done by diskdrake, and where starting windows failed.
Keywords: PATCH => NEEDINFO
Summary: Partitioning issues with 4k drives ("advanced format") => Resizing Windows partition on SSD might lead to unbootable windows.
Correction to comment 26:

All the partitions' start sectors are divisible by 2048, so the partitions are aligned on 1MB boundaries.

This is OK for most, but not all, SSDs. See comment 19.
(In reply to comment #28)
> Correction to comment 26
>
> All the partitions start sector are divisible by 2048 so the partitions are
> aligned on 1MB boundaries
>
> This is OK for most but not all SSD's See Comment 19

1024 KiB = 1 MiB
2048 KiB = 2 MiB

I'd say that is 2 MiB boundaries. What am I missing?

There is more than 2 MiB between partitions (only sda5 starts close to the beginning of the Extended partition it is in).

About Windows: I just heard that Windows puts its Master File Table at the end of the Windows partition. So if that partition is made smaller to free up space for Mageia, the MFT should somehow be moved to the new end of the Windows partition :/
OOPS You're right, it is not KiB, but only half a KiB
CC: (none) => anssi.hannula
@ Anssi, Sorry for making a mess of this bug (I don't feel too guilty, though, because I had asked for clarification in comment 24 :þ ) Feel free to revert the change I made to the summary of this bug, if that is better. And about being clear: in comment 30 I talked about sector size.
(In reply to Richard Houser from comment #19)
> I just got a new SSD, and it is my understanding that the 25nm process chips
> like in the Vertex3 120GB line use a 2MB (rather than 1MB, like the
> quick-fix mentions) erase block size. So, for me, I need alignment on 2MB
> boundaries.

Do you have a link to any documentation confirming that it's using a 2MB erase block size?

All the documentation I've found for SSDs indicates they use erase blocks of 16 to 512 KB. Newer hard drives use 4 KB physical sectors.

The 1MB alignment was chosen because all of those erase block sizes, as well as 512-byte and 4 KB sectors, divide evenly into 1MB.
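The divisibility argument in the comment above is easy to verify. This is a small Python sketch; it assumes the erase block sizes in the 16 to 512 KB range are powers of two (the comment only gives the range), and `divides_evenly` is a hypothetical helper name.

```python
KIB = 1024
MIB = 1024 * KIB

# Sector sizes from the comment, plus erase block sizes assumed to be
# the powers of two between 16 KiB and 512 KiB.
UNIT_SIZES = [512, 4 * KIB] + [n * KIB for n in (16, 32, 64, 128, 256, 512)]


def divides_evenly(unit_bytes: int, alignment_bytes: int = MIB) -> bool:
    """True if a partition aligned to alignment_bytes is also aligned to unit_bytes."""
    return alignment_bytes % unit_bytes == 0


if __name__ == "__main__":
    for unit in UNIT_SIZES:
        print(unit, divides_evenly(unit))
    # A 2 MiB erase block, as claimed in comment 19, is the case 1 MiB does not cover.
    print(2 * MIB, divides_evenly(2 * MIB))
```

Every size in the list divides 1 MiB, while a 2 MiB unit does not, which is exactly the disagreement between the two commenters.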
CC: (none) => davidwhodgins
(In reply to Dave Hodgins from comment #32)
> (In reply to Richard Houser from comment #19)
> > I just got a new SSD, and it is my understanding that the 25nm process chips
> > like in the Vertex3 120GB line use a 2MB (rather than 1MB, like the
> > quick-fix mentions) erase block size. So, for me, I need alignment on 2MB
> > boundaries.
>
> Do you have a link to any documentation confirming that it's using
> a 2MB erase block size?
>
> All documentation I've found for ssd drives indicate they use from
> 16 to 512 kb erase blocks. Newer hard drives are using 4kb sector
> physical sector sizes.
>
> The 1MB alignment was chosen, as all of the erase block sizes, and
> 512 byte, and 4kb sectors will all fit an exact number in a 1MB area.

Some manufacturers seem to be very tight-lipped regarding SSD erase block sizes, but here are a couple of examples with the OCZ hardware. I actually have the Agility3 unit as opposed to the Vertex3, btw.

http://superuser.com/questions/492084/is-alignment-to-erase-block-size-needed-for-modern-ssds
http://www.ocztechnologyforum.com/forum/showthread.php?95819-Block-Sizes-Vertex-3
I have an OCZ-AGILITY4. hdparm -I shows
Logical Sector size: 512 bytes
Physical Sector size: 512 bytes

The forum posts seem to be mixing up 2kb, 2mb, 2048 sectors (which is 1MB), etc.

I'd really like to see an actual specification sheet confirming that it's using a 2MB erase block size.

If the drive really does need 2MB alignment, anyone using gparted, diskdrake, or Windows 7 with the defaults will have problems, as they all use 1MB alignment by default.
(In reply to Richard Houser from comment #33)
> Some manufacturers seem to be very tight-lipped regarding SSD erase block
> sizes, but here are a couple examples with the OCZ hardware. I actually
> have the Agility3 unit as opposed to the Vertex3, btw.

Can you run a test on it?

Create a partition starting at sector 2048 (aka 1MB alignment), and a second partition starting at a sector that is a multiple of 4096 (aka 2MB alignment).

Create a test file in RAM with
    dd if=/dev/urandom of=/dev/shm/test.data bs=1M count=64

Then test the write speed with
    time dd if=/dev/shm/test.data of=/path/to/partition/test.data
for both partitions and report the results here.
(In reply to Dave Hodgins from comment #34)
> I have an OCZ-AGILITY4. hdparm -I shows
> Logical Sector size: 512 bytes
> Physical Sector size: 512 bytes
>
> The forum posts seem to be mixing up 2kb, 2mb, 2048 sectors (which
> is 1MB), etc.
>
> I'd really like to see an actual specification sheet confirming that
> it's using a 2MB erase block size.
>
> If the drive really does need 2MB alignment, anyone using gparted,
> diskdrake, or windows 7, with the defaults, will have problems, as they
> all use 1MB alignment by default.

Unfortunately, you can't trust the sector information coming from hdparm. So many drives have been configured to lie for the sake of Windows XP, and someone had the bright idea to misreport the physical size, too. For example, I have five different models of Western Digital hard drives with 4 KB physical sectors, and all but two report 512 bytes to the machine. I don't see why SSDs would be different.

I've tried a few times to get the official erase block size from OCZ, but their reps just gave me some BS about it being a trade secret, etc.
(In reply to Dave Hodgins from comment #35)
> (In reply to Richard Houser from comment #33)
> > Some manufacturers seem to be very tight-lipped regarding SSD erase block
> > sizes, but here are a couple examples with the OCZ hardware. I actually
> > have the Agility3 unit as opposed to the Vertex3, btw.
>
> Can you run a test on it?
>
> Create a partition starting at sector 2048 (aka 1MB alignment), and a second
> partition starting at a sector that is a multiple of 4096 (aka 2MB
> alignment),
>
> Create a test file in ram with
> dd if=/dev/urandom of=/dev/shm/test.data bs=1M count=64
>
> Then test the write speed with
> time dd if=/dev/shm/test.data of=/path/to/partition/test.data
> for both partitions and report the results here.

I don't think that would accomplish anything. Unlike a hard disk, there is the additional erase process to deal with. So, normal writes happen at a page level, and erases happen at a higher level. In this case, there seems to be an online consensus that the erase block for this SSD generation is 256 pages. As long as the drive (not just the OS) has free space available, writes should happen at optimal speed anytime the write matches an even multiple of the page size. So, 4KB writes (for example) should happen at the same speed in either partition.

To empirically test this (something I'm not prepared to do at the moment), I think this would work a bit better:

1.) Zero the entire drive.
2.) Pass the appropriate TRIM commands to the drive to let the hardware physically erase the data. This should take a long time to run.
3.) Create the partitions you specified.
4.) Format each with the same filesystem supporting TRIM (ext4, for example). However, you would need to ensure online TRIM is not enabled.
5.) Write a very large number of 2MB (needs to be exact) files to each partition (fill the partitions).
6.) Randomly select and delete a small fraction of the files on the first partition (exactly 10%, perhaps?).
7.) fsync
8.) Run a timed, batch trim operation on the drive.
9.) Repeat step 6, only for the other partition. Delete the exact same number of files.
10.) fsync
11.) Run a timed, batch trim operation on the drive.
12.) Go back to step #5 a total of two more times, to make sure the numbers are consistent.

Analysis...

If the trim command runs a LOT faster for the 1MB alignment, I think that means the drive didn't actually complete the TRIM, due to those erase blocks still containing data.

If they took the same amount of time and were extremely fast, I think that means the trim failed in both cases and the alignment is too small.

If they took the same amount of time and were much slower, I think that means the trim succeeded in both cases, and thus either alignment works.

It all comes down to the page size. If the page size is 2KB (from 2007 - ex. google "typical mlc page size" and pick the micron link), the erase block is 512KB. If the page size is 8KB (like ours? - ex. http://www.anandtech.com/show/6432/the-intel-ssd-dc-s3700-intels-3rd-generation-controller-analyzed/2), we're actually looking at 2MB alignments. An article at http://www.anandtech.com/show/6388/intel-ssd-335-240gb-review references an in-development Intel chip that will use 4MB erase block sizes, btw.
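The page-size argument that closes the comment above is a single multiplication: erase block size = page size times pages per block. A short Python sketch follows, taking the 256 pages per erase block from the comment as an assumption (it is described there as an online consensus, not a vendor specification); `erase_block_bytes` and `alignment_is_sufficient` are hypothetical names.

```python
KIB = 1024
MIB = 1024 * KIB

PAGES_PER_ERASE_BLOCK = 256    # assumption taken from the comment above


def erase_block_bytes(page_bytes: int, pages_per_block: int = PAGES_PER_ERASE_BLOCK) -> int:
    """Erase block size implied by a given NAND page size."""
    return page_bytes * pages_per_block


def alignment_is_sufficient(alignment_bytes: int, page_bytes: int) -> bool:
    """True if partitions aligned to alignment_bytes also start on erase block boundaries."""
    return alignment_bytes % erase_block_bytes(page_bytes) == 0


if __name__ == "__main__":
    for page in (2 * KIB, 8 * KIB):
        block = erase_block_bytes(page)
        print(page // KIB, "KiB page ->", block // KIB, "KiB erase block;",
              "1 MiB alignment sufficient:", alignment_is_sufficient(MIB, page))
```

Under that assumption, 2 KiB pages give 512 KiB erase blocks, which the 1 MiB default covers, while 8 KiB pages give 2 MiB erase blocks, which it does not.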
Changing the summary back to the original one and closing as fixed. Sorry for having contributed to mixing issues in this report.

See https://ml.mageia.org/l/arc/doc-discuss/2013-04/msg00037.html (click this link again after confirming you're not a spammer)

I'll try to open the enhancement request for diskdrake to allow the user to specify the alignment to use later today (please ping me if I forget).

(In reply to Thierry Vignaud from comment #22)
> BTW, I think this might break windows booting after resizing (two people
> reported to me windows failed to boot after resizing it for installing
> Mageia)

@ Thierry, if you can get more information about this, can you then please open a new bug report for it?
Status: NEW => RESOLVED
Resolution: (none) => FIXED
Summary: Resizing Windows partition on SSD might lead to unbootable windows. => Partitioning issues with 4k drives ("advanced format")
Whiteboard: 3beta1 (Mga2) => (Mga2)
I realize this bug is now closed, but I found references to the new 19nm NAND keeping the same 2MiB settings, so I'm including them here in case alignment needs to be revisited later.

http://www.anandtech.com/show/4284/sandisktoshiba-take-back-the-crown-with-a-different-kind-of-nand

I'll verify the default partitioning when 5-beta2 drops, and open a new bug with a reference if required.