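The problem is with the supplied sample code (and, more generally, the lack of helpful samples in this area). Also, for immediate gratification (seeing chaos ensue without waiting too long), you need to be more aggressive than the docs sample, which is real work in its own right, as you have found out.
You're better served by a different overload of the ChaosParameters constructor.
Try this (replace the sample code with this implementation):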
    var startTimeUtc = DateTime.UtcNow;
    var stabilizationTimeout = TimeSpan.FromSeconds(30.0);
    var timeToRun = TimeSpan.FromMinutes(60.0);
    var maxConcurrentFaults = 7;
    var timeBetweenFaults = new TimeSpan(0, 0, 10);
    var timeBetweenIterations = new TimeSpan(0, 0, 10);
    Dictionary<string, string> _context = new Dictionary<string, string>();

    // Aggressive chaos...
    var clusterHealthPolicy = new System.Fabric.Health.ClusterHealthPolicy()
    {
        MaxPercentUnhealthyApplications = 90,
        MaxPercentUnhealthyNodes = 100
    };

    var parameters = new ChaosParameters(
        stabilizationTimeout,
        maxConcurrentFaults,
        true, /* EnableMoveReplicaFault */
        timeToRun,
        _context,
        timeBetweenIterations,
        timeBetweenFaults,
        clusterHealthPolicy);
Note: I suggest you do this in a new static async function that returns a Task.
Full (working) sample:
    using System;
    using System.Collections.Generic;
    using System.Fabric;
    using System.Fabric.Chaos.DataStructures;
    using System.Linq;
    using System.Threading;
    using System.Threading.Tasks;

    public static async Task RunChaos()
    {
        var clusterConnectionString = "localhost:19000";
        using (var client = new FabricClient(clusterConnectionString))
        {
            var startTimeUtc = DateTime.UtcNow;
            var stabilizationTimeout = TimeSpan.FromSeconds(30.0);
            var timeToRun = TimeSpan.FromMinutes(60.0);
            var maxConcurrentFaults = 7;
            var timeBetweenFaults = new TimeSpan(0, 0, 10);
            var timeBetweenIterations = new TimeSpan(0, 0, 10);
            Dictionary<string, string> _context = new Dictionary<string, string>();

            // Aggressive chaos: tolerate a largely unhealthy cluster so faults keep coming.
            var clusterHealthPolicy = new System.Fabric.Health.ClusterHealthPolicy()
            {
                MaxPercentUnhealthyApplications = 90,
                MaxPercentUnhealthyNodes = 100
            };

            var parameters = new ChaosParameters(
                stabilizationTimeout,
                maxConcurrentFaults,
                true, /* EnableMoveReplicaFault */
                timeToRun,
                _context,
                timeBetweenIterations,
                timeBetweenFaults,
                clusterHealthPolicy);

            var token = CancellationToken.None;

            try
            {
                await client.TestManager.StartChaosAsync(parameters, new TimeSpan(0, 30, 0), token);
            }
            catch (FabricChaosAlreadyRunningException)
            {
                Console.WriteLine("An instance of Chaos is already running in the cluster.");
            }

            var filter = new ChaosReportFilter(startTimeUtc, DateTime.MaxValue);

            // ChaosEventComparer is the IEqualityComparer<ChaosEvent> helper class from the
            // documentation sample; it is not part of System.Fabric and must be defined alongside this method.
            var eventSet = new HashSet<ChaosEvent>(new ChaosEventComparer());

            while (true)
            {
                var report = await client.TestManager.GetChaosReportAsync(filter);

                // Print each event only once across polls.
                foreach (var chaosEvent in report.History)
                {
                    if (eventSet.Add(chaosEvent))
                    {
                        Console.WriteLine(chaosEvent);
                    }
                }

                // When Chaos stops, a StoppedEvent is created.
                // If a StoppedEvent is found, exit the loop.
                var lastEvent = report.History.LastOrDefault();
                if (lastEvent is StoppedEvent)
                {
                    break;
                }

                // Await the delay instead of blocking the thread inside an async method.
                await Task.Delay(TimeSpan.FromSeconds(1.0));
            }
        }
    }
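With timeToRun set to 60 minutes, the polling loop only exits once Chaos times out or is stopped. If you want to end a run early, something along these lines should work from a separate call. This is a rough sketch, not part of the original sample: the class and method names (ChaosControl, StopChaos) are made up for illustration, and it assumes the same local endpoint as above.

```csharp
using System;
using System.Fabric;
using System.Threading;
using System.Threading.Tasks;

public static class ChaosControl
{
    // Hypothetical helper: asks the cluster to stop the running Chaos instance.
    // The endpoint default mirrors the local dev cluster used in the sample above.
    public static async Task StopChaos(string clusterConnectionString = "localhost:19000")
    {
        using (var client = new FabricClient(clusterConnectionString))
        {
            // The 30-second operation timeout is an arbitrary choice for this sketch.
            await client.TestManager.StopChaosAsync(
                TimeSpan.FromSeconds(30),
                CancellationToken.None);
        }
    }
}
```

Once the stop is processed, a StoppedEvent should show up in the report history, which is what the loop in RunChaos waits for.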
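Can you also add the current health state of the entities in the cluster? If there are any warnings and "consider warnings as errors" is set, Chaos will think things are unhealthy and won't move them. – masnider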
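One way to capture that health state is to dump the aggregated health of the cluster, nodes, and applications right before starting Chaos. A minimal sketch, assuming a local dev cluster; the helper name (ClusterHealthDump.PrintClusterHealth) is hypothetical, and it evaluates health with the cluster's default policy rather than the aggressive one passed to ChaosParameters.

```csharp
using System;
using System.Fabric;
using System.Fabric.Health;
using System.Threading.Tasks;

public static class ClusterHealthDump
{
    // Hypothetical helper: prints the aggregated health of the cluster,
    // each node, and each application.
    public static async Task PrintClusterHealth(string clusterConnectionString = "localhost:19000")
    {
        using (var client = new FabricClient(clusterConnectionString))
        {
            ClusterHealth health = await client.HealthManager.GetClusterHealthAsync();
            Console.WriteLine("Cluster: {0}", health.AggregatedHealthState);

            foreach (NodeHealthState node in health.NodeHealthStates)
            {
                Console.WriteLine("Node {0}: {1}", node.NodeName, node.AggregatedHealthState);
            }

            foreach (ApplicationHealthState app in health.ApplicationHealthStates)
            {
                Console.WriteLine("Application {0}: {1}", app.ApplicationName, app.AggregatedHealthState);
            }
        }
    }
}
```

Anything reported as Warning here is worth a closer look if warnings are being treated as errors.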
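I have tried setting "ConsiderWarningsAsError" to true and have confirmed that all entities are healthy, but I still see the same issue every time I run this code. Is there anywhere I can see logs to help diagnose this? –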
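The Chaos report itself can serve as a first diagnostic: when a health check between iterations fails, Chaos records a ValidationFailedEvent with a reason. A rough sketch that prints only those events, assuming a local dev cluster; ChaosDiagnostics.PrintValidationFailures is a hypothetical name, and the sketch reads only the first batch of the report without following any continuation token.

```csharp
using System;
using System.Fabric;
using System.Fabric.Chaos.DataStructures;
using System.Linq;
using System.Threading.Tasks;

public static class ChaosDiagnostics
{
    // Hypothetical helper: lists validation failures recorded by Chaos since startTimeUtc.
    public static async Task PrintValidationFailures(
        DateTime startTimeUtc,
        string clusterConnectionString = "localhost:19000")
    {
        using (var client = new FabricClient(clusterConnectionString))
        {
            var filter = new ChaosReportFilter(startTimeUtc, DateTime.MaxValue);
            var report = await client.TestManager.GetChaosReportAsync(filter);

            // Keep only the events that explain why Chaos did not induce faults.
            foreach (var failure in report.History.OfType<ValidationFailedEvent>())
            {
                Console.WriteLine("{0:o} {1}", failure.TimeStampUtc, failure.Reason);
            }
        }
    }
}
```

If nothing is printed, the health check is probably not what is holding Chaos back, and the full event history from RunChaos is the next place to look.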
Have you run into any problems as a result of using this? I'm seeing similar issues with my Azure Service Fabric cluster. – Kramer00