VSAN performance tests: how does your cluster behave under real stress?

Our VSAN node configuration:

Model: Dell R730xd

768 GB of RAM

PERC H730

Capacity tier: 21 x 1.2 TB SAS disks

Caching tier: 3 x SanDisk SX350-3200

Dedicated 10 GbE NIC for VSAN

We run into problems when we run a large number of virtual machines on our cluster, each of which generates a large number of IOPS (similar to IOmeter).

We are facing these problems:

- guests drop out of vCenter

- vmx files have been corrupted

This does not happen if we use only one virtual machine per host to generate the load. In that case, these are our observations:

Read % | Write % | Block KB | Outst. IO | File GB | Workers | Random % | Seq. % | FTT | Components | Read cache res. % | Total IOPS | Total MB/s | Avg lat. ms
    45 |      55 |        4 |         5 |      10 |       4 |       95 |      5 |   1 |          6 |                 0 |      8,092 |      33.15 |        0.6
    45 |      55 |        4 |         5 |      10 |       4 |       95 |      5 |   1 |         12 |                 0 |     10,821 |      44.33 |       0.46
    45 |      55 |        4 |         5 |      10 |       4 |       95 |      5 |   1 |          6 |               100 |     12,611 |      51.65 |       0.39
    45 |      55 |        4 |         5 |      10 |       4 |       95 |      5 |   1 |         12 |               100 |     11,374 |      46.59 |       0.43
    45 |      55 |        4 |        64 |      10 |       4 |       95 |      5 |   1 |         12 |               100 |     29,746 |      121.8 |       2.15
   100 |       0 |        4 |        64 |      10 |       4 |      100 |      0 |   1 |         12 |                 0 |     50,576 |      207.1 |       1.26
   100 |       0 |        4 |        64 |      10 |       4 |      100 |      0 |   1 |         12 |               100 |     50,571 |      207.1 |       1.26
   100 |       0 |    0.512 |        64 |      10 |       4 |        0 |    100 |   0 |          1 |               100 |     67,330 |      34.47 |        3.8
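
As a sanity check on the results above, the reported throughput can be recomputed from the IOPS and block size. This is only a sketch: it assumes the 4 KB block size means 4096 bytes and that the throughput column is in decimal megabytes per second, neither of which the post states. The function name `throughput_mb_s` is made up for the illustration.

```python
# Cross-check: throughput (MB/s) = IOPS * block size in bytes / 10^6.
# Each tuple is (block size in KB, total IOPS, reported MB/s), copied from
# the 4 KB rows of the results above.
rows = [
    (4, 8092, 33.15),
    (4, 12611, 51.65),
    (4, 29746, 121.8),
    (4, 50576, 207.1),
]


def throughput_mb_s(block_kb, iops):
    """Throughput in decimal MB/s, assuming 1 KB = 1024 bytes (an assumption)."""
    return iops * block_kb * 1024 / 1_000_000


for block_kb, iops, reported in rows:
    computed = throughput_mb_s(block_kb, iops)
    print(f"{block_kb} KB x {iops} IOPS -> {computed:.2f} MB/s (reported {reported})")
```

Under these assumptions the computed values agree with the reported ones to within rounding, which suggests the columns were read correctly.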

These numbers seem very good, don't they?

From our point of view, it would be OK if latency increased as the load grows with multiple virtual machines. But as described above, our cluster breaks down once the number of virtual machines passes a certain point.

We are currently testing with a four-node VSAN cluster with the above config. In this cluster, we were able to power on 187 load-generating machines before three of the four hosts entered the "not responding" state.

We wonder if anyone else has run large performance tests. If so, we would be very interested in your comments. Maybe this will help us find the fault in our setup.

Kind regards

Daniel

With the help of VMware we discovered that the problems we face are caused by the load-generating virtual machines: each VM generates 2048 outstanding IOs, so with 187 VMs on the cluster we have an unrealistic number of outstanding IOs (382,976; if I remember the observation graph of our cluster correctly, it would drop to as low as 13,000 per host), which we will never see in real-world scenarios. VSAN 6.1 has problems with this many outstanding IOs and VMware is working on a solution. Maybe this will be resolved with 6.2 and QoS?
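
The arithmetic behind that figure can be sketched as follows. The VM count, the per-VM outstanding IO count, and the four hosts come from this thread; the even spread across hosts is an assumption made only for the illustration.

```python
# Total outstanding IOs generated by the load VMs (figures from the thread).
vms = 187
outstanding_per_vm = 2048
hosts = 4  # four-node test cluster

total_outstanding = vms * outstanding_per_vm
# Assumption: the load is spread evenly across the hosts.
per_host = total_outstanding / hosts

print(total_outstanding)  # 382976
print(per_host)           # 95744.0
```

Even under an even spread, the generated load per host is several times the roughly 13,000 per host recalled from the cluster's observation graph.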

A few other changes:

- a newer driver for our Intel X710 network cards

- a new driver and firmware (beta version from Dell) for our PERC H730 controller

and some advanced settings:

esxcfg-advcfg -s 100000 /LSOM/diskIoTimeout

esxcfg-advcfg -s 4 /LSOM/diskIoRetryFactor

esxcfg-advcfg -s 2047 /LSOM/heapSize (with these parameters, we were able to create 3 disk groups, each with 7 magnetic disks and 1 flash device, on each host)

Tags: VMware

Similar Questions

  • What is the best for your laptop: shutdown, standby, or hibernation?

    Original title: 'stop'?

    Which is better for a laptop: shutting it down every time, sleep, or hibernate? Does it hurt to shut it down all the time?    Thank you. Windows Vista

    Hello

    It is best to check with your system manufacturer's support, their online documentation, and their
    forums about your model (with a grain of salt), because the answer differs between brands
    and even between models.

    Usually there is no exact answer, but it is best not to turn the laptop off and back on
    within a short time if that can be avoided. Shutting the computer down is fine for longer breaks, while for short
    periods (several hours) sleep or hibernate is the better way to save energy. Also, letting
    the electronics cool down and heat up frequently is not the best for their long-term use and
    life expectancy.

    Using Power Plans properly, with the right sleep and hibernate settings for
    how you use the computer, gets you the best performance and service life.
    There is, however, no exact answer. The settings that seem best for the way you use the
    system will probably also be best in the long term. Even if you could find the "magic" settings
    that optimize the machine's function and life, but then had to use the machine in a different
    pattern than you would like, you would actually get less use and life out of it
    than with settings that fit your usage patterns.

    One thing many people forget is that turning off the screen can also save power when
    you are away for a few moments and don't want to turn the computer off. This saves
    some power compared with using a screen saver, but it can be hard to remember and does not save
    a lot.

    Check with your system manufacturer and their forums, as your laptop might already have this
    capability.

    How to manually turn off a notebook or laptop LCD screen - free utility
    http://www.Raymond.CC/blog/archives/2008/07/20/how-to-manually-turn-off-notebook-or-laptop-LCD-screen/

    Turn off your notebook LCD with one click - author's website
    http://www.RedmondPie.com/turn-off-your-notebook-LCD-with-one-click/

    I hope this helps.

    Rob Brown - MS MVP - Windows Desktop Experience: Bike - Mark Twain said it right.

  • What is the size of your recovery Partition?

    Hi all

    I am running 10.9.5 with FileVault active (mid-2014 Retina MacBook Pro, 2.2 GHz, 16 GB RAM, stock Apple SSD). At one point, I had a 650 MB Recovery Partition on my internal SSD (I'm sure it was under a previous iteration of 10.9). At some point, it went to 1.03 GB. Was it due to an update, or is something maybe wrong? So I would like to know what size your RP is if you run 10.9.5.

    Also, I have several Carbon Copy Cloner backups on external hard drives. One started as a SuperDuper! backup and has a 1.30 GB RP (exactly 2 x the size of a 650 MB RP; a coincidence?). Another also began as an SD backup, was changed to a CCC backup, and has an RP of 650 MB. Another was created from the beginning using CCC and its RP creator; its size is 784.2 MB. So, I want to see if one of them is 'correct' or if it is normal to have different RP sizes.

    To find the size of your RP, go to Utilities > System Information > Hardware > SATA/SATA Express and look down where it says "Recovery HD."

    Any help is greatly appreciated.

    Thank you!

    3: Apple_Boot Recovery HD 650.0 MB disk0s3

    That is the recovery partition on my clean-install El Capitan system.  I used 'diskutil list' to get this information.

  • Deterministic functions and performance gain.  What is the reality behind them?

    Hello Pros,

    What can we really gain by using deterministic functions? I've seen some good examples and discussions on asktom. After reading these examples, I feel deterministic functions are rarely used and their benefits are neglected. Am I right? I request the pros to discuss how they have used deterministic functions, or how they can best be used in the right places to gain performance.

    [url http://asktom.oracle.com/pls/apex/f?p=100:11:0:P11_QUESTION_ID:1547006324238 #12928321943595] Example site asktom

    Here is the link I referred to and took a few examples from.

    Thanks in advance

    You can also see the CBO in action (optimizing the SQL code and not doing work that is not necessary to determine the final result).

    Do you think the function is actually executed when you use SQL such as the following? :-)

    SQL> exec global.counter := 0;
    
    PL/SQL procedure successfully completed.
    
    SQL> select count(*) from (select SpecialFoo(level/level)  as X from dual connect by level <= 10000);
    
      COUNT(*)
    ----------
         10000
    
    SQL> exec dbms_output.put_line( 'execution='||global.counter );
    execution=0
    
    PL/SQL procedure successfully completed.
    
    SQL> 
    
  • What problem is preventing a download, with an encryption error?

    I get the error message 'a problem is preventing this folder from being encrypted' when I download the firmware for the redsn0w jailbreak... version 4.3.1 for iPod 4

    Hello

    1. Which browser do you use?

    If you use Internet Explorer, then try resetting Internet Explorer and check whether that helps:

    http://Windows.Microsoft.com/en-us/Windows7/reset-Internet-Explorer-settings

    I hope this helps.

  • What is Link State Power Management under PCI Express in the power management options list?

    What is Link State Power Management under PCI Express in the power management options list?

    Hello

    The PCI Express Link State Power Management option is part of the PCI-E specification and works with Active State Power Management (ASPM) in Windows 7.

    It is a complex subject, but it can be described simply as follows.

    There are 2 levels of power management in the PCI Express options.

    The difference between these 2 options is the power savings versus the latency (time to recover from a sleep state).

    If you select the first option, Moderate power savings, the power savings are lower, but the time to recover from a sleep state (latency) is much shorter.

    If you select Maximum power savings, the power savings are greater, but the time to recover from a sleep state (latency) is much longer.

    I hope this helps.

    Thank you for using Windows 7

    Ronnie Vernon MVP

  • What is the feedback caption under Quiz objects in the Object Style Manager?

    I am configuring the Object Style Manager for a new project in Cp 9 and realized that I have no idea what the "Advance Feedback Caption/SmartShape" items under Quiz objects are for. Can someone clarify? I don't see anything that uses them on any quiz master slide.

    Thank you

    Jenny N

    The advanced Feedback captions can only be configured for (radio button) Multiple Choice quiz questions and not for any other type.

    The option to enable them appears ONLY when you select one of the answer options of a multiple choice quiz question and then go to the Properties tab > Options section.  If you check the Advanced Answer Option box, then each individual answer can have a feedback caption that appears if the learner chooses that answer.  This allows you to customize the feedback based on the individual answers selected instead of just for the quiz question as a whole.

    But as said, there is only one type of quiz question that offers this option.

  • What is the status of your order?

    I ordered my Moto X Pure yesterday, about an hour after pre-orders started. Today, I went into my email and checked the status of my order. According to the Moto order status tracking site, my Moto has been built and my order is 'finished'. If this is true, I guess it will be shipped quickly.

    Excellent work, Motorola.

    I hate to be a downer, but your best guide is the shipping date that you were given when you placed your order.

  • What is the fourth control button in the upper right of the FF4 window?

    I had to reinstall FF4 during a virus cleanup and there are now 4 control buttons displayed in the upper right of the window. From the right, there are the usual 'Close' (red), 'Restore Down' and 'Minimize' buttons (both blue), and then there is a fourth, green button, with no tooltip associated with it. What is the purpose of this new button?
    I'm not normally this curious, but I have just fought off the 'Antivirus Antispam 2011' malware and am very sensitive to new graphics that I don't recognize.
    I am running XP Pro.
    Thank you!

    Can you post a screenshot of it?

  • What is the best switch to deploy, SG300-52 or SGE2010?

    Hi guys,

    I am currently evaluating switches that can be deployed in our temporary buildings as distribution switches. I just want to ask for your recommendation between the SG300-52 and the SGE2010. Which is better in your experience? The things I am interested in are the switch's IPv6 readiness, its features, and its performance (transfer speed, LACP, stability, etc.)

    Thanks in advance!

    Zero

    Zero,


    By far the SG300-52: you have these functions, plus a text-based CLI coming in the next version of the firmware.  It supports more features and it is the next step for Cisco small business switches.  The SGE2010 was the best of the previous era, but in functionality the SG300-52 comes out on top.  Not to mention, you get a true 48 ports for connections and 4 additional ports for fiber or stacking modules that will be supported in the next firmware.

  • What is the meaning of this statement.

    From http://docs.oracle.com/cd/E11882_01/server.112/e16638/optimops.htm#autoId34, there is a statement that I can't understand.
    If the inner table's access path is independent of the outer table, then the same rows are retrieved for every iteration of the outer loop, degrading performance considerably.
    What is the meaning of this statement? Can you give me an example?

    Thanks
    Lonion

    >
    From http://docs.oracle.com/cd/E11882_01/server.112/e16638/optimops.htm#autoId34, there is a statement that I can't understand.

    If the inner table's access path is independent of the outer table, then the same rows are retrieved for every iteration of the outer loop, degrading performance considerably. 
    

    What is the meaning of this statement? Can you give me an example?
    >
    Can you say: Cartesian join?

    This quote is from the section explaining nested loops, and note that it gives you a clue:
    >
    See also:

    "Cartesian Joins"
    >
    The sentence BEFORE the one you quoted is what connects your quote with the SEE ALSO:
    >
    It is important to ensure that the inner table is driven from (dependent on) the outer table.
    >
    This statement means that the rows of the inner table should DEPEND ON the outer table.

    In a Cartesian join the inner table does not depend on the outer table at all:

    SELECT D.*, E.* FROM DEPT D, EMP E
    

    There is no WHERE clause, so there is nothing telling Oracle how the tables are related. Oracle will perform a Cartesian join, and if a nested loop is used then, as your quote says, "the same rows are retrieved for every iteration of the outer loop, degrading performance considerably."

    For every row Oracle visits in the outer table, it will now query the inner table. But because there is no WHERE clause, there is no information available to EXCLUDE rows from the inner table, so "the same rows are retrieved" (ALL of them) "for every iteration of the outer loop".

    Here is the same query as above, using the USE_NL hint to force Oracle to use a nested loop:

    SQL> select /*+ use_nl (d e) */ d.*, e.* from dept d, emp e;
    
    56 rows selected.
    
    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 4192419542
    
    ---------------------------------------------------------------------------
    | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |      |    56 |  3248 |    10   (0)| 00:00:01 |
    |   1 |  NESTED LOOPS      |      |    56 |  3248 |    10   (0)| 00:00:01 |
    |   2 |   TABLE ACCESS FULL| DEPT |     4 |    80 |     3   (0)| 00:00:01 |
    |   3 |   TABLE ACCESS FULL| EMP  |    14 |   532 |     2   (0)| 00:00:01 |
    ---------------------------------------------------------------------------
    
    Statistics
    ----------------------------------------------------------
              1  recursive calls
              0  db block gets
             42  consistent gets
              0  physical reads
              0  redo size
           3897  bytes sent via SQL*Net to client
            452  bytes received via SQL*Net from client
              5  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
             56  rows processed
    
    SQL>
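
The difference between an independent and a dependent inner access path, as discussed in the thread above, can be sketched in a few lines of Python. This is a toy model, not how Oracle executes joins: the data is made up (only the row counts of 4 and 14 mirror the plan above), and the function names `independent_inner` and `dependent_inner` are hypothetical.

```python
from collections import defaultdict

# Toy data: 4 departments and 14 employees (empno, deptno). Values are made up.
dept = [10, 20, 30, 40]
emp = [(empno, (10, 20, 30)[empno % 3]) for empno in range(14)]


def independent_inner(dept, emp):
    """Inner access does not depend on the outer row: full scan every iteration."""
    retrieved = 0
    rows = []
    for d in dept:
        for e in emp:  # the same 14 rows are retrieved each time
            retrieved += 1
            rows.append((d, e))
    return rows, retrieved


def dependent_inner(dept, emp):
    """Inner access is driven by the outer row, here via a lookup on deptno."""
    index = defaultdict(list)
    for e in emp:
        index[e[1]].append(e)
    retrieved = 0
    rows = []
    for d in dept:
        for e in index.get(d, []):  # only the rows matching this department
            retrieved += 1
            rows.append((d, e))
    return rows, retrieved


cartesian_rows, cartesian_retrieved = independent_inner(dept, emp)
joined_rows, joined_retrieved = dependent_inner(dept, emp)
print(len(cartesian_rows), cartesian_retrieved)  # 56 56
print(len(joined_rows), joined_retrieved)        # 14 14
```

The independent version retrieves all 14 inner rows for each of the 4 outer rows, which is where the 56 rows in the plan come from; the dependent version touches each inner row only once.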
    
  • Best way to test mobile layouts without owning the devices

    I'm testing web designs for phones and tablets, but cannot afford to buy even 3 or 4 devices, let alone the dozens that need to be tested today. What is the best way to get accurate results without buying a dozen devices? I have a 'CC' subscription, but there is apparently nothing in it to help. There used to be a program called "Device Central" in the Creative Suites, but it disappeared. I left design 2 years ago because of the whole design testing issue but would like to start over.  I'm tired of watching tutorials on the subject, only to find out that they don't work without the devices.  Advice or recommendations would be greatly appreciated. TIA.

    You'll drive yourself crazy trying to design for each device.

    Build your sites with a combination of responsive design (using media queries to target a small number of the most common fixed device widths) and fluid design (using percentage widths so that the layout transitions smoothly between the 'steps' in your MQs on intermediate devices).

  • What is the name of my best friend from childhood?

    What is the name of your best childhood friend?

    This is the name that YOU typed in when you were first asked the question.

  • What is the correct procedure to take a host out of a hav?

    Sometimes I need to take a host out of a hav; what is the correct procedure for it?

    Thank you!

    George

    Disconnect the host in your cluster, and then remove the host.  Or evacuate all guests from the host, put it in maintenance mode, and then remove it from the cluster.

  • Can't see the Essbase cluster from SmartView

    Hello
    I created an Essbase cluster on the provider services and added two cubes. Now I want to use this cluster, but I don't see it in SmartView. Could someone please tell me what I should do to access it from SmartView?

    Regards

    Chandra

    Hello

    OK, V11 is slightly different.

    Click the Add button and choose Oracle Essbase.

    Product: Oracle Essbase
    Server name: the name of your cluster here

    Then enter the user name and password; when you click OK, your cluster should appear.

    Cheers

    John
    http://John-Goodwin.blogspot.com/
